ribs.archives.ArchiveBase

class ribs.archives.ArchiveBase(*, solution_dim: int | integer | tuple[int | integer, ...], objective_dim: tuple[()] | int | integer, measure_dim: int | integer)

Base class for archives.

An archive stores elites. Each elite consists of several data fields: at a minimum, the elite has a solution and the evaluated objective and measures of the solution. The elite may also include additional data fields. Besides elites, archives can store components like k-D trees and density estimators.

The primary method of an archive is add(), which writes new solutions to it. There are also methods to read from the archive, such as retrieve() and data(). These methods typically operate over batches of inputs (e.g., adding multiple solutions at once with add()), but methods such as add_single() and retrieve_single() support single inputs.

Due to the flexibility of workflows available in pyribs, it is possible to design archives that require only a small subset of the methods in this base class. As such, none of the methods listed here are required to be implemented in child classes, although by default they will raise NotImplementedError when called.
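The pattern described above can be sketched in plain Python. This is only an illustration and does not use pyribs itself: ToyArchiveBase and AddOnlyArchive are hypothetical names, the status value is a placeholder, and the real base class defines more methods than shown here.

```python
class ToyArchiveBase:
    """Hypothetical stand-in for a base class whose methods are all optional."""

    def add(self, solution, objective, measures, **fields):
        raise NotImplementedError("add() is not implemented for this archive")

    def retrieve(self, measures):
        raise NotImplementedError("retrieve() is not implemented for this archive")

    def __len__(self):
        raise NotImplementedError("__len__() is not implemented for this archive")


class AddOnlyArchive(ToyArchiveBase):
    """Child class that supports only the write path."""

    def __init__(self):
        self._elites = []

    def add(self, solution, objective, measures, **fields):
        # Store every solution in the batch; this toy version has no elitism.
        for sol, obj, meas in zip(solution, objective, measures):
            self._elites.append(
                {"solution": sol, "objective": obj, "measures": meas})
        # Placeholder status: one entry per solution in the batch.
        return {"status": [1] * len(solution)}

    def __len__(self):
        return len(self._elites)


archive = AddOnlyArchive()
archive.add([[0.0, 1.0]], [1.5], [[0.2, 0.3]])

# retrieve() was not overridden, so the inherited default raises.
try:
    archive.retrieve([[0.2, 0.3]])
    retrieve_supported = True
except NotImplementedError:
    retrieve_supported = False
```

A workflow that only ever calls add() and len() can use such a child class without implementing anything else.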

Parameters:
solution_dim: int | integer | tuple[int | integer, ...]

Dimensionality of the solution space. Scalar or multi-dimensional solution shapes are allowed by passing an empty tuple or tuple of integers, respectively.

objective_dim: tuple[()] | int | integer

Dimensionality of the objective space. For single-objective optimization problems where the objective is a scalar, this argument should be an empty tuple (). In multi-objective optimization problems, this argument should be an integer indicating the number of objectives.

measure_dim: int | integer

Dimensionality of the measure space.
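As a rough illustration of how these three arguments relate to the batched array shapes used by add() and retrieve(), the helper below computes the expected shape of a batch. The function batch_shape is hypothetical and not part of pyribs; it assumes that an integer dimension denotes a 1D item and that a tuple is used as the item shape directly.

```python
def batch_shape(batch_size, dim):
    """Shape of a batch of items whose per-item dimensionality is `dim`."""
    # A tuple (including the empty tuple) is used as the item shape as-is;
    # anything else is treated as an integer describing a 1D item.
    item_shape = tuple(dim) if isinstance(dim, tuple) else (int(dim),)
    return (batch_size,) + item_shape


solution_shape = batch_shape(4, 10)        # 1D solutions with 10 parameters
image_shape = batch_shape(4, (3, 32, 32))  # multi-dimensional solutions
scalar_objective = batch_shape(4, ())      # single-objective: one scalar each
multi_objective = batch_shape(4, 2)        # two objectives per solution
```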

Methods

__iter__()

Creates an iterator over the elites in the archive.

__len__()

Number of elites in the archive.

add(solution, objective, measures, **fields)

Inserts a batch of solutions and their data into the archive.

add_single(solution, objective, measures, ...)

Inserts a single solution and its data into the archive.

clear()

Resets the archive, e.g., by removing all elites in it.

data()

Returns data of the elites in the archive.

retrieve(measures)

Queries the archive for elites with the given batch of measures.

retrieve_single(measures)

Queries the archive for an elite with the given measures.

sample_elites(n[, replace])

Randomly samples elites from the archive.

Attributes

dtypes

Mapping from field name to dtype for all fields in the archive.

empty

Whether the archive is empty.

field_list

List of data fields in the archive.

measure_dim

Dimensionality of the measure space.

objective_dim

Dimensionality of the objective space.

solution_dim

Dimensionality of the solution space.

stats

Statistics about the archive.

__iter__() -> Iterator[dict[str, Any]]

Creates an iterator over the elites in the archive.

Example

for elite in archive:
    elite["solution"]
    elite["objective"]
    elite["measures"]
    ...

__len__() -> int

Number of elites in the archive.

add(solution: numpy.typing.ArrayLike, objective: numpy.typing.ArrayLike, measures: numpy.typing.ArrayLike, **fields: numpy.typing.ArrayLike) -> dict[str, ndarray]

Inserts a batch of solutions and their data into the archive.

The indices of all arguments should “correspond” to each other, i.e., solution[i], objective[i], and measures[i] should be the solution parameters, objective, and measures for solution i.

For API consistency, all child classes should take in solution, objective, and measures. There may be cases where one of these parameters is not necessary, e.g., objective is not required in diversity optimization settings. In such cases, it should be possible to pass in None as the argument.

Parameters:
solution: numpy.typing.ArrayLike

(batch_size, solution_dim) array of solution parameters.

objective: numpy.typing.ArrayLike

(batch_size, objective_dim) array with objective function evaluations of the solutions.

measures: numpy.typing.ArrayLike

(batch_size, measure_dim) array with measure space coordinates of all the solutions.

**fields: numpy.typing.ArrayLike

Additional data for each solution. Each argument should be an array with batch_size as the first dimension.

Returns:

Dict describing the result of the add operation. Each entry should be an array that provides the information for each solution, e.g., one entry might be a “status” array of shape (batch_size,) that provides the status of each solution. The exact keys and values are determined by child classes.
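The batch contract can be sketched with a toy grid archive written in plain Python. Everything here is an assumption made for illustration: toy_add is a hypothetical function, the cells are obtained by flooring the measures, plain lists stand in for arrays, and the status codes (0, 1, 2) are invented for this sketch rather than taken from any particular child class.

```python
import math


def toy_add(cells, solution, objective, measures, **fields):
    """Adds a batch to `cells`, a dict keyed by discretized measures.

    Hypothetical status codes: 0 = not added, 1 = improved an existing
    elite, 2 = filled a new cell.
    """
    batch_size = len(solution)
    batched = {"objective": objective, "measures": measures, **fields}
    for name, values in batched.items():
        if len(values) != batch_size:
            raise ValueError(f"{name} must have {batch_size} entries")

    status = []
    for i in range(batch_size):
        # Index i of every argument describes the same solution.
        cell = tuple(math.floor(m) for m in measures[i])
        current = cells.get(cell)
        if current is not None and current["objective"] >= objective[i]:
            status.append(0)
            continue
        status.append(2 if current is None else 1)
        cells[cell] = {
            "solution": solution[i],
            "objective": objective[i],
            "measures": measures[i],
            **{name: values[i] for name, values in fields.items()},
        }
    return {"status": status}


cells = {}
result = toy_add(
    cells,
    solution=[[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]],
    objective=[1.0, 3.0, 2.0],
    measures=[[0.5, 0.5], [0.7, 0.2], [1.5, 0.5]],
    generation=[0, 0, 0],  # extra field passed through **fields
)
```

The returned "status" entry has one value per input solution, which mirrors the per-solution arrays described above.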

add_single(solution: numpy.typing.ArrayLike, objective: numpy.typing.ArrayLike, measures: numpy.typing.ArrayLike, **fields: numpy.typing.ArrayLike) -> dict[str, Any]

Inserts a single solution and its data into the archive.

Parameters:
solution: numpy.typing.ArrayLike

Parameters of the solution.

objective: numpy.typing.ArrayLike

Objective function evaluation of the solution.

measures: numpy.typing.ArrayLike

Coordinates in measure space of the solution.

**fields: numpy.typing.ArrayLike

Additional data for the solution.

Returns:

Information describing the result of the add operation. As in add(), the content of this dict is determined by child classes.
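One way a child class might relate the single and batch methods is to wrap the batch version: add a batch axis of length 1 on the way in and strip it on the way out. The sketch below shows that idea under stated assumptions; add_single_via_batch and fake_batch_add are hypothetical helpers, and nothing here implies that any given archive implements add_single() this way.

```python
def add_single_via_batch(add_fn, solution, objective, measures, **fields):
    """Wraps a batch `add_fn` so that it accepts one solution."""
    # Add a leading batch axis of length 1 to every argument.
    batched_fields = {name: [value] for name, value in fields.items()}
    info = add_fn([solution], [objective], [measures], **batched_fields)
    # Strip the batch axis from each entry of the returned dict.
    return {key: values[0] for key, values in info.items()}


def fake_batch_add(solution, objective, measures, **fields):
    """Hypothetical batch add that accepts everything it is given."""
    return {"status": [2] * len(solution), "value": list(objective)}


info = add_single_via_batch(
    fake_batch_add, [0.1, 0.2], 1.5, [0.3, 0.4], note="first")
```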

clear() -> None

Resets the archive, e.g., by removing all elites in it.

After calling this method, the archive should be empty.

data(fields: str, return_type: 'dict' | 'tuple' | 'pandas' = 'dict') -> ndarray
data(fields: None | Collection[str] = None, return_type: 'dict' = 'dict') -> dict[str, ndarray]
data(fields: None | Collection[str] = None, return_type: 'tuple' = 'tuple') -> tuple[ndarray]
data(fields: None | Collection[str] = None, return_type: 'pandas' = 'pandas') -> ArchiveDataFrame

Returns data of the elites in the archive.

Parameters:
fields: None | str | Collection[str] = None

List of fields to include, such as "solution", "objective", "measures", and other fields in the archive. This can also be a single str indicating a field name.

return_type: 'dict' | 'tuple' | 'pandas' = 'dict'

Format of the returned data; see below. Ignored if fields is a str.

Returns:

The data for all elites in the archive. All data returned by this method will be a copy, i.e., the data will not update as the archive changes. If fields was a single str, the returned data will just be an array holding data for the given field, such as:

measures = archive.data("measures")

Otherwise, the returned data can take the following forms, depending on the return_type argument:

  • return_type="dict": Dict mapping from each field name to that field's data for all elites. An example is:

    {
      "solution": [[1.0, 1.0, ...], ...],
      "objective": [1.5, ...],
      "measures": [[1.0, 2.0], ...],
      ...
    }
    

    The keys in this dict can be modified with the fields arg; duplicate fields will be ignored since the dict stores unique keys.

  • return_type="tuple": Tuple of arrays matching the field order in fields. For instance, if fields is ["objective", "measures"], this method would return a tuple of (objective_arr, measures_arr) that could be unpacked as:

    objective, measures = archive.data(["objective", "measures"],
                                       return_type="tuple")
    

    Unlike with the dict return type, duplicate fields will show up as duplicate entries in the tuple, e.g., fields=["objective", "objective"] will result in two objective arrays being returned.

    When fields=None (the default case), the fields in the tuple will be ordered according to the field_list.

  • return_type="pandas": An ArchiveDataFrame with the following columns:

    • For fields that are scalars, a single column with the field name. For example, objective would have a single column called objective.

    • For fields that are 1D arrays, multiple columns with the name suffixed by its index. To illustrate, for a measures field of length 10, the dataframe would contain 10 columns with names measures_0, measures_1, …, measures_9. The output format for fields with >1D data is currently not defined.

    In short, the dataframe might look like this by default:

    solution_0 | ... | objective | measures_0 | ...

    Like the other return types, the columns returned can be adjusted with the fields parameter.
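The column naming rule for return_type="pandas" can be illustrated without pandas. The helper below is a hypothetical sketch, not the library's implementation: it takes a mapping from field name to the per-elite shape of that field and produces the column names that the rule above implies.

```python
def dataframe_columns(field_shapes):
    """Maps {field name: per-elite shape} to a list of column names."""
    columns = []
    for name, shape in field_shapes.items():
        if len(shape) == 0:
            # Scalar field: a single column with the field name.
            columns.append(name)
        elif len(shape) == 1:
            # 1D field: one column per entry, suffixed by its index.
            columns.extend(f"{name}_{i}" for i in range(shape[0]))
        else:
            # Fields with >1D data have no defined dataframe layout.
            raise ValueError(f"Field {name!r} has >1D data")
    return columns


columns = dataframe_columns(
    {"solution": (3,), "objective": (), "measures": (2,)})
```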

Raises:
  • ValueError – Invalid field name provided.

  • ValueError – Invalid return_type provided.

  • ValueError – Passed return_type="pandas" when one of the fields has >1D data.
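The dispatch between a single field name, the dict form, and the tuple form can be sketched as follows. This is a toy analogue under stated assumptions: toy_data, store, and field_list are hypothetical, plain lists stand in for arrays, copies are shallow, and the "pandas" return type is deliberately left out of the sketch (so the toy rejects it, unlike the real method).

```python
def toy_data(store, field_list, fields=None, return_type="dict"):
    """Toy analogue of data() over `store`, a dict of per-field lists."""
    if isinstance(fields, str):
        if fields not in store:
            raise ValueError(f"Invalid field name: {fields!r}")
        # Single field name: return_type is ignored.
        return list(store[fields])

    # fields=None falls back to the archive's own field order.
    names = list(field_list) if fields is None else list(fields)
    for name in names:
        if name not in store:
            raise ValueError(f"Invalid field name: {name!r}")

    if return_type == "dict":
        # Dict keys are unique, so duplicate fields collapse.
        return {name: list(store[name]) for name in names}
    if return_type == "tuple":
        # Tuple entries follow `names` exactly, duplicates included.
        return tuple(list(store[name]) for name in names)
    raise ValueError(f"Invalid return_type: {return_type!r}")


store = {
    "solution": [[1.0, 1.0]],
    "objective": [1.5],
    "measures": [[1.0, 2.0]],
}
field_list = ["solution", "objective", "measures"]

as_dict = toy_data(store, field_list, ["objective", "objective"])
as_tuple = toy_data(
    store, field_list, ["objective", "objective"], return_type="tuple")
measures = toy_data(store, field_list, "measures")
everything = toy_data(store, field_list, return_type="tuple")
```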

retrieve(measures: numpy.typing.ArrayLike) -> tuple[ndarray, dict[str, ndarray]]

Queries the archive for elites with the given batch of measures.

This method operates in batch. It takes in a batch of measures and outputs the batched data for the elites:

occupied, elites = archive.retrieve(...)
occupied  # Shape: (batch_size,)
elites["solution"]  # Shape: (batch_size, solution_dim)
elites["objective"]  # Shape: (batch_size, objective_dim)
elites["measures"]  # Shape: (batch_size, measure_dim)
...

occupied indicates whether an elite was found for each measure, i.e., whether the archive was occupied at each queried measure. If occupied[i] is True, then elites["solution"][i], elites["objective"][i], elites["measures"][i], and other fields will contain the data of the elite for the input measures[i]. If occupied[i] is False, then those fields will instead have arbitrary values, e.g., elites["solution"][i] may be set to all NaN.

Parameters:
measures: numpy.typing.ArrayLike

(batch_size, measure_dim) array of measure space points at which to retrieve solutions.

Returns:

2-element tuple of (boolean occupied array, dict of elite data). See above for description.
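Because unoccupied entries hold arbitrary values, callers generally need to mask the elite data with occupied before using it. The sketch below shows that usage with a toy lookup; toy_retrieve is hypothetical, cells are obtained by flooring the measures, plain lists stand in for arrays, and NaN or None are just one possible choice of placeholder.

```python
import math


def toy_retrieve(cells, measures):
    """Toy batch lookup over `cells`, a dict keyed by discretized measures."""
    occupied, solutions, objectives = [], [], []
    for m in measures:
        elite = cells.get(tuple(math.floor(x) for x in m))
        occupied.append(elite is not None)
        # Unoccupied entries receive placeholders instead of elite data.
        solutions.append(elite["solution"] if elite is not None else None)
        objectives.append(
            elite["objective"] if elite is not None else math.nan)
    return occupied, {"solution": solutions, "objective": objectives}


cells = {(0, 0): {"solution": [1.0, 1.0], "objective": 3.0}}
occupied, elites = toy_retrieve(cells, [[0.5, 0.5], [4.2, 0.1]])

# Only read elite data where the mask is True.
found = [obj for ok, obj in zip(occupied, elites["objective"]) if ok]
```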

retrieve_single(measures: numpy.typing.ArrayLike) -> tuple[bool, dict[str, Any]]

Queries the archive for an elite with the given measures.

While retrieve() takes in a batch of measures, this method takes in the measures for only one solution and returns a single bool and a dict with single entries:

occupied, elite = archive.retrieve_single(...)
occupied  # Bool
elite["solution"]  # Shape: (solution_dim,)
elite["objective"]  # Shape: (objective_dim,)
elite["measures"]  # Shape: (measure_dim,)
...

Parameters:
measures: numpy.typing.ArrayLike

(measure_dim,) array of measures.

Returns:

2-element tuple of (boolean, dict of data for one elite)

sample_elites(n: int | integer, replace: bool = True) -> dict[str, ndarray]

Randomly samples elites from the archive.

Currently, this sampling is done uniformly at random, either with or without replacement. Additional sampling methods may be supported in the future.

Example

elites = archive.sample_elites(16)
elites["solution"]  # Shape: (16, solution_dim)
elites["objective"]
elites["measures"]
...

Parameters:
n: int | integer

Number of elites to sample.

replace: bool = True

Whether to sample with replacement. If True, each elite is drawn independently, so the same elite may be sampled more than once.

Returns:

A batch of elites randomly selected from the archive.

Raises:
  • IndexError – The archive is empty.

  • ValueError – n was greater than the number of elites in the archive when replace=False.
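The sampling behavior and both error cases can be sketched with the standard library. This is an illustration under stated assumptions rather than the library's implementation: toy_sample_elites is hypothetical, it operates on a plain list of elites, and it returns a list instead of a dict of arrays.

```python
import random


def toy_sample_elites(elites, n, replace=True, rng=None):
    """Uniformly samples `n` entries from the list `elites`."""
    rng = rng or random.Random()
    if not elites:
        raise IndexError("The archive is empty.")
    if replace:
        # Independent draws: the same elite may appear more than once.
        return rng.choices(elites, k=n)
    if n > len(elites):
        raise ValueError(
            "n exceeds the number of elites when replace=False")
    # Without replacement, every sampled elite is distinct.
    return rng.sample(elites, n)


elites = ["a", "b", "c"]
with_replacement = toy_sample_elites(elites, 16, rng=random.Random(0))
without_replacement = toy_sample_elites(
    elites, 3, replace=False, rng=random.Random(0))
```

With replace=True, n may exceed the number of elites, which is why the 16-sample call above succeeds on a 3-elite list.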

property dtypes : dict[str, dtype]

Mapping from field name to dtype for all fields in the archive.

property empty : bool

Whether the archive is empty.

property field_list : list[str]

List of data fields in the archive.

property measure_dim : int | integer

Dimensionality of the measure space.

property objective_dim : tuple[()] | int | integer

Dimensionality of the objective space.

The empty tuple () indicates a scalar objective.

property solution_dim : int | integer | tuple[int | integer, ...]

Dimensionality of the solution space.

property stats : ArchiveStats

Statistics about the archive.

See ArchiveStats for more info.