ribs.archives.SlidingBoundariesArchive¶
-
class ribs.archives.SlidingBoundariesArchive(*, solution_dim: int | integer | tuple[int | integer, ...], dims: Collection[int | integer], ranges: Collection[tuple[float | floating, float | floating]], epsilon: float | floating = 1e-06, qd_score_offset: float | floating = 0.0, seed: int | integer | None = None, solution_dtype: numpy.typing.DTypeLike = None, objective_dtype: numpy.typing.DTypeLike = None, measures_dtype: numpy.typing.DTypeLike = None, dtype: numpy.typing.DTypeLike = None, extra_fields: dict[str, tuple[int | integer | tuple[int | integer, ...], DTypeLike]] | None = None, remap_frequency: int | integer = 100, buffer_capacity: int | integer = 1000)[source]¶
An archive with a fixed number of sliding boundaries in each dimension.
This archive is the container described in Fontaine 2019. Just like the GridArchive, it can be visualized as an n-dimensional grid in the measure space that is divided into a certain number of cells in each dimension. Internally, this archive stores a buffer with the buffer_capacity most recent solutions and uses them to determine the boundaries of each dimension of the measure space. After every remap_frequency solutions are inserted, the archive remaps the boundaries based on the solutions in the buffer.
Initially, the archive has no solutions, so it cannot automatically calculate the boundaries. Thus, until the first remap, this archive divides the measure space defined by ranges into equally-sized cells.
Overall, this archive attempts to make the distribution of the space illuminated by the archive more accurately match the true distribution of the measures when they are not uniformly distributed.
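The remapping idea described above can be sketched with plain NumPy. This is a minimal illustration and not the library's implementation: `remap_boundaries` is a hypothetical helper, and placing the edges with `np.quantile` is an assumption about one reasonable way to pick evenly spaced percentage marks from the buffer.

```python
import numpy as np


def remap_boundaries(buffer_measures, dims, ranges):
    """Hypothetical helper: put cell edges at evenly spaced quantiles."""
    buffer_measures = np.asarray(buffer_measures, dtype=float)
    boundaries = []
    for i, (n_cells, (low, high)) in enumerate(zip(dims, ranges)):
        if len(buffer_measures) == 0:
            # Nothing buffered yet: fall back to equally sized cells.
            edges = np.linspace(low, high, n_cells + 1)
        else:
            # n_cells + 1 edges at the 0%, 1/n_cells, ..., 100% marks.
            clipped = np.clip(buffer_measures[:, i], low, high)
            edges = np.quantile(clipped, np.linspace(0.0, 1.0, n_cells + 1))
        boundaries.append(edges)
    return boundaries


rng = np.random.default_rng(0)
# Skewed measures: dimension 0 has most of its mass near 0.
buffer_measures = np.column_stack([
    rng.beta(1.0, 8.0, size=1000),
    rng.uniform(-2.0, 2.0, size=1000),
])
boundaries = remap_boundaries(
    buffer_measures, dims=[4, 2], ranges=[(0.0, 1.0), (-2.0, 2.0)]
)
```

With edges placed this way, each cell along a dimension covers roughly the same number of buffered solutions, so cells are narrow where the measures are dense and wide where they are sparse.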
By default, this archive stores the following data fields: solution, objective, measures, and index. The integer index uniquely identifies each cell.
- Parameters:¶
- solution_dim: int | integer | tuple[int | integer, ...]¶
Dimensionality of the solution space. Scalar or multi-dimensional solution shapes are allowed by passing an empty tuple or tuple of integers, respectively.
- dims: Collection[int | integer]¶
Number of cells in each dimension of the measure space, e.g. [20, 30, 40] indicates there should be 3 dimensions with 20, 30, and 40 cells. (The number of dimensions is implicitly defined in the length of this argument).
- ranges: Collection[tuple[float | floating, float | floating]]¶
Lower and upper bound of each dimension of the measure space, e.g. [(-1, 1), (-2, 2)] indicates the first dimension should have bounds \([-1,1]\) (inclusive), and the second dimension should have bounds \([-2,2]\) (inclusive). ranges should be the same length as dims.
- epsilon: float | floating = 1e-06¶
Due to floating point precision errors, we add a small epsilon when computing the archive indices in the index_of() method (refer to its implementation for details). Pass this parameter to configure that epsilon.
- qd_score_offset: float | floating = 0.0¶
Archives often contain negative objective values, and if the QD score were to be computed with these negative objectives, the algorithm would be penalized for adding new cells with negative objectives. Thus, a standard practice is to normalize all the objectives so that they are non-negative by introducing an offset. This QD score offset will be subtracted from all objectives in the archive, e.g., if your objectives go as low as -300, pass in -300 so that each objective will be transformed as objective - (-300).
- seed: int | integer | None = None¶
Value to seed the random number generator. Set to None to avoid a fixed seed.
- solution_dtype: numpy.typing.DTypeLike = None¶
Data type of the solutions. Defaults to float64 (numpy’s default floating point type).
- objective_dtype: numpy.typing.DTypeLike = None¶
Data type of the objectives. Defaults to float64 (numpy’s default floating point type).
- measures_dtype: numpy.typing.DTypeLike = None¶
Data type of the measures. Defaults to float64 (numpy’s default floating point type).
- dtype: numpy.typing.DTypeLike = None¶
Shortcut for providing the data type of the solutions, objectives, and measures. Defaults to float64 (numpy’s default floating point type). This parameter sets all the dtypes simultaneously. To set individual dtypes, pass solution_dtype, objective_dtype, and measures_dtype. Note that dtype cannot be used at the same time as those parameters.
- extra_fields: dict[str, tuple[int | integer | tuple[int | integer, ...], DTypeLike]] | None = None¶
Description of extra fields of data that are stored next to elite data like solutions and objectives. The description is a dict mapping from a field name (str) to a tuple of (shape, dtype). For instance, {"foo": ((), np.float32), "bar": ((10,), np.float32)} will create a “foo” field that contains scalar values and a “bar” field that contains 10D values. Note that field names must be valid Python identifiers, and names already used in the archive are not allowed.
- remap_frequency: int | integer = 100¶
Frequency of remapping. The archive remaps each time remap_frequency solutions have been inserted.
- buffer_capacity: int | integer = 1000¶
Number of solutions to keep in the buffer. Solutions in the buffer will be reinserted into the archive after remapping.
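To make the effect of the qd_score_offset parameter described above concrete, here is a small NumPy sketch of the arithmetic. It assumes the QD score is the sum of the elites' objectives after the offset is subtracted; `qd_score` is a hypothetical helper written for this example and is not a function of the archive.

```python
import numpy as np


def qd_score(objectives, qd_score_offset=0.0):
    """Hypothetical helper: sum of objectives after subtracting the offset."""
    objectives = np.asarray(objectives, dtype=float)
    return float(np.sum(objectives - qd_score_offset))


objectives = [-300.0, -120.0, 50.0]

raw = qd_score(objectives)                              # no offset
shifted = qd_score(objectives, qd_score_offset=-300.0)  # objective - (-300)

# Filling one more cell with a negative objective lowers the raw score
# but raises the shifted one.
raw_after = qd_score(objectives + [-250.0])
shifted_after = qd_score(objectives + [-250.0], qd_score_offset=-300.0)
```

Without the offset, discovering a new cell with objective -250 makes the score worse; with the offset set to the lowest possible objective, every newly filled cell contributes a non-negative amount.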
Methods
__iter__() – Creates an iterator over the elites in the archive.
__len__() – Number of elites in the archive.
add(solution, objective, measures, **fields) – Inserts a batch of solutions into the archive.
add_single(solution, objective, measures, ...) – Inserts a single solution into the archive.
clear() – Removes all elites in the archive.
data() – Returns data of the elites in the archive.
grid_to_int_index(grid_indices) – Converts a batch of grid indices into a batch of integer indices.
index_of(measures) – Returns archive indices for the given batch of measures.
index_of_single(measures) – Returns the index of the measures for one solution.
int_to_grid_index(int_indices) – Converts a batch of indices into indices in the archive's grid.
retrieve(measures) – Queries the archive for elites with the given batch of measures.
retrieve_single(measures) – Queries the archive for an elite with the given measures.
sample_elites(n[, replace]) – Randomly samples elites from the archive.
Attributes
best_elite – The elite with the highest objective in the archive.
boundaries – The dynamic boundaries of the cells in each dimension.
buffer_capacity – Maximum capacity of the buffer.
cells – Total number of cells in the archive.
dims – (measure_dim,) array listing the number of cells in each dimension.
dtypes – Mapping from field name to dtype for all fields in the archive.
empty – Whether the archive is empty.
epsilon – Epsilon for computing archive indices.
field_list – List of data fields in the archive.
interval_size – (measure_dim,) array listing the size of each dim (upper_bounds - lower_bounds).
lower_bounds – (measure_dim,) array listing the lower bound of each dimension.
measure_dim – Dimensionality of the measure space.
objective_dim – Dimensionality of the objective space.
qd_score_offset – Subtracted from objective values when computing the QD score.
remap_frequency – Frequency of remapping.
solution_dim – Dimensionality of the solution space.
stats – Statistics about the archive.
upper_bounds – (measure_dim,) array listing the upper bound of each dimension.
- add(solution: numpy.typing.ArrayLike, objective: numpy.typing.ArrayLike, measures: numpy.typing.ArrayLike, **fields: numpy.typing.ArrayLike) dict[str, ndarray][source]¶
Inserts a batch of solutions into the archive.
Note
Unlike in other archives, this method is not truly batched; rather, it is implemented by calling add_single() on the solutions in the batch, in the order that they are passed in. As such, this method is not invariant to the ordering of the solutions in the batch.
See add_single() and ribs.archives.GridArchive.add() for arguments and return values.
- add_single(solution: numpy.typing.ArrayLike, objective: numpy.typing.ArrayLike, measures: numpy.typing.ArrayLike, **fields: numpy.typing.ArrayLike) dict[str, Any][source]¶
Inserts a single solution into the archive.
This method remaps the archive after every remap_frequency solutions are added. Remapping involves changing the boundaries of the archive to the percentage marks of the measures stored in the buffer and re-adding all of the solutions stored in the buffer and the current archive.
- Parameters:¶
- Returns:¶
Information describing the result of the add operation. The dict contains status and value keys, exactly as in ribs.archives.GridArchive.add().
- Raises:¶
ValueError – The array arguments do not match their specified shapes.
ValueError – objective is non-finite (inf or NaN) or measures has non-finite values.
- data(fields: str, return_type: 'dict' | 'tuple' | 'pandas' = 'dict') ndarray[source]¶
- data(fields: None | Collection[str] = None, return_type: 'dict' = 'dict') dict[str, ndarray]
- data(fields: None | Collection[str] = None, return_type: 'tuple' = 'tuple') tuple[ndarray]
- data(fields: None | Collection[str] = None, return_type: 'pandas' = 'pandas') ArchiveDataFrame
Returns data of the elites in the archive.
- Parameters:¶
- fields: str¶
- fields: None | Collection[str] = None
List of fields to include, such as "solution", "objective", "measures", and other fields in the archive. This can also be a single str indicating a field name.
- return_type: 'dict' | 'tuple' | 'pandas' = 'dict'¶
- return_type: 'dict' = 'dict'
- return_type: 'tuple' = 'tuple'
- return_type: 'pandas' = 'pandas'
Type of data to return; see below. Ignored if fields is a str.
- Returns:¶
The data for all elites in the archive. All data returned by this method will be a copy, i.e., the data will not update as the archive changes. If fields was a single str, the returned data will just be an array holding data for the given field, such as:
    measures = archive.data("measures")
Otherwise, the returned data can take the following forms, depending on the return_type argument:
- return_type="dict": Dict mapping from the field name to the field data. An example is:
    {
      "solution": [[1.0, 1.0, ...], ...],
      "objective": [1.5, ...],
      "measures": [[1.0, 2.0], ...],
      ...
    }
  The keys in this dict can be modified with the fields arg; duplicate fields will be ignored since the dict stores unique keys.
- return_type="tuple": Tuple of arrays matching the field order in fields. For instance, if fields is ["objective", "measures"], this method would return a tuple of (objective_arr, measures_arr) that could be unpacked as:
    objective, measures = archive.data(["objective", "measures"], return_type="tuple")
  Unlike with the dict return type, duplicate fields will show up as duplicate entries in the tuple, e.g., fields=["objective", "objective"] will result in two objective arrays being returned.
  When fields=None (the default case), the fields in the tuple will be ordered according to the field_list.
- return_type="pandas": An ArchiveDataFrame with the following columns:
  - For fields that are scalars, a single column with the field name. For example, objective would have a single column called objective.
  - For fields that are 1D arrays, multiple columns with the name suffixed by its index. To illustrate, for a measures field of length 10, the dataframe would contain 10 columns with names measures_0, measures_1, …, measures_9. The output format for fields with >1D data is currently not defined.
In short, the dataframe might look like this by default:
| solution_0 | … | objective | measures_0 | … |
|            | … |           |            | … |
Like the other return types, the columns returned can be adjusted with the fields parameter.
- Raises:¶
ValueError – Invalid field name provided.
ValueError – Invalid return_type provided.
ValueError – Passed return_type="pandas" when one of the fields has >1D data.
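The column naming rule for return_type="pandas" described above can be sketched in a few lines of plain Python. `dataframe_columns` is a hypothetical helper written for this illustration, not part of the library, and the field shapes and their order in the example are assumptions.

```python
def dataframe_columns(field_shapes):
    """Hypothetical helper: expand field shapes into flat column names."""
    columns = []
    for name, shape in field_shapes.items():
        if shape == ():
            # Scalar field: one column named after the field.
            columns.append(name)
        elif len(shape) == 1:
            # 1D field: one column per entry, suffixed by its index.
            columns.extend(f"{name}_{i}" for i in range(shape[0]))
        else:
            # Mirrors the documented ValueError for fields with >1D data.
            raise ValueError(f"Field {name!r} has >1D data")
    return columns


columns = dataframe_columns(
    {"solution": (3,), "objective": (), "measures": (2,), "index": ()}
)
```

Here a 3D solution and 2D measures expand into `solution_0` through `solution_2` and `measures_0`, `measures_1`, while the scalar fields keep their names.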
- grid_to_int_index(grid_indices: numpy.typing.ArrayLike) ndarray¶
Converts a batch of grid indices into a batch of integer indices.
Refer to index_of() for more info.
- Parameters:¶
- grid_indices: numpy.typing.ArrayLike¶
(batch_size, measure_dim) array of indices in the archive grid.
- Returns:¶
(batch_size,) array of integer indices.
- Raises:¶
ValueError – grid_indices is not of shape (batch_size, measure_dim).
- index_of(measures: numpy.typing.ArrayLike) ndarray[source]¶
Returns archive indices for the given batch of measures.
First, values are clipped to the bounds of the measure space. Then, the values are mapped to cells via a binary search along the boundaries in each dimension.
At this point, we have “grid indices” – indices of each measure in each dimension. Since indices returned by this method must be single integers (as opposed to a tuple of grid indices), we convert these grid indices into integer indices with numpy.ravel_multi_index() and return the result.
It may be useful to have the original grid indices. Thus, we provide the grid_to_int_index() and int_to_grid_index() methods for converting between grid and integer indices.
As an example, the grid indices can be used to access boundaries of a measure value’s cell. The following retrieves the lower and upper bounds of the cell along dimension 0:
    # Access only element 0 since this method operates in batch.
    idx = archive.int_to_grid_index(archive.index_of(...))[0]
    lower = archive.boundaries[0][idx[0]]
    upper = archive.boundaries[0][idx[0] + 1]
See boundaries for more info.
- Parameters:¶
- measures: numpy.typing.ArrayLike¶
(batch_size, measure_dim) array of coordinates in measure space.
- Returns:¶
(batch_size,) array of integer indices representing the flattened grid coordinates.
- Raises:¶
ValueError – measures is not of shape (batch_size, measure_dim).
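The clip, binary search, and flatten steps described for index_of() can be sketched with NumPy as follows. This is an approximation under stated assumptions and not the library's code: `index_of_sketch` is a hypothetical function, the boundaries are made up, and the epsilon adjustment mentioned in the constructor parameters is left out.

```python
import numpy as np


def index_of_sketch(measures, boundaries, dims):
    """Hypothetical sketch: clip, binary-search each dimension, flatten."""
    measures = np.asarray(measures, dtype=float)
    grid_indices = np.empty(measures.shape, dtype=np.int64)
    for i, edges in enumerate(boundaries):
        # Step 1: clip to the bounds of the measure space.
        clipped = np.clip(measures[:, i], edges[0], edges[-1])
        # Step 2: binary search over the interior edges, so the result
        # always lies in [0, dims[i] - 1].
        grid_indices[:, i] = np.searchsorted(edges[1:-1], clipped, side="right")
    # Step 3: flatten grid indices into single integer indices.
    return np.ravel_multi_index(tuple(grid_indices.T), dims)


# Assumed boundaries for a 3 x 2 grid with unequal cell widths.
boundaries = [np.array([0.0, 0.1, 0.5, 1.0]), np.array([-2.0, 0.0, 2.0])]
dims = (3, 2)
measures = [[0.05, -1.0], [0.3, 1.0], [5.0, 0.0]]

indices = index_of_sketch(measures, boundaries, dims)
```

The last query lies outside the measure space in dimension 0, so it is clipped to the upper bound and lands in the final cell of that dimension.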
- index_of_single(measures: numpy.typing.ArrayLike) int | integer[source]¶
Returns the index of the measures for one solution.
See index_of().
- Parameters:¶
- measures: numpy.typing.ArrayLike¶
(measure_dim,) array of measures for a single solution.
- Returns:¶
Integer index of the measures in the archive’s storage arrays.
- Raises:¶
ValueError – measures is not of shape (measure_dim,).
ValueError – measures has non-finite values (inf or NaN).
- int_to_grid_index(int_indices: numpy.typing.ArrayLike) ndarray¶
Converts a batch of indices into indices in the archive’s grid.
Refer to index_of() for more info.
- Parameters:¶
- int_indices: numpy.typing.ArrayLike¶
(batch_size,) array of integer indices such as those output by index_of().
- Returns:¶
(batch_size, measure_dim) array of indices in the archive grid.
- Raises:¶
ValueError – int_indices is not of shape (batch_size,).
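Since index_of() is documented as flattening grid indices with numpy.ravel_multi_index(), the relationship between the two index forms can be illustrated with NumPy alone. The sketch below assumes that the reverse conversion behaves like numpy.unravel_index(); it does not call the archive methods themselves.

```python
import numpy as np

dims = (20, 30, 40)  # number of cells in each dimension

# (batch_size, measure_dim) grid indices.
grid_indices = np.array([[0, 0, 0], [1, 2, 3], [19, 29, 39]])

# Grid -> integer: row-major flattening, e.g. 1*30*40 + 2*40 + 3 = 1283.
int_indices = np.ravel_multi_index(tuple(grid_indices.T), dims)

# Integer -> grid: the inverse operation recovers the original rows.
recovered = np.column_stack(np.unravel_index(int_indices, dims))
```

The largest integer index is one less than the total number of cells (20 * 30 * 40 = 24000).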
- retrieve(measures: numpy.typing.ArrayLike) tuple[ndarray, dict[str, ndarray]][source]¶
Queries the archive for elites with the given batch of measures.
This method operates in batch. It takes in a batch of measures and outputs the batched data for the elites:
    occupied, elites = archive.retrieve(...)
    occupied  # Shape: (batch_size,)
    elites["solution"]  # Shape: (batch_size, solution_dim)
    elites["objective"]  # Shape: (batch_size, objective_dim)
    elites["measures"]  # Shape: (batch_size, measure_dim)
    ...
occupied indicates whether an elite was found for each measure, i.e., whether the archive was occupied at each queried measure. If occupied[i] is True, then elites["solution"][i], elites["objective"][i], elites["measures"][i], and other fields will contain the data of the elite for the input measures[i]. If occupied[i] is False, then those fields will instead have arbitrary values, e.g., elites["solution"][i] may be set to all NaN.
- Parameters:¶
- measures: numpy.typing.ArrayLike¶
(batch_size, measure_dim) array of measure space points at which to retrieve solutions.
- Returns:¶
2-element tuple of (boolean occupied array, dict of elite data). See above for description.
- Raises:¶
ValueError – measures is not of shape (batch_size, measure_dim).
ValueError – measures has non-finite values (inf or NaN).
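Because entries for unoccupied cells hold arbitrary values, results from retrieve() are usually filtered with the occupied mask before use. The sketch below imitates that pattern with toy NumPy arrays; the storage layout and variable names are assumptions made for illustration and do not reflect the archive's internals.

```python
import numpy as np

# Toy storage for an archive with 4 cells (hypothetical layout).
occupied_cells = np.array([True, False, True, False])
stored_objective = np.array([1.5, np.nan, 0.2, np.nan])

# Cell indices for three queries, e.g. as index_of() might return them.
query_indices = np.array([2, 1, 0])

# Batched lookup: one flag and one value per query.
occupied = occupied_cells[query_indices]
objective = np.where(occupied, stored_objective[query_indices], np.nan)

# Only entries where occupied is True are meaningful.
valid_objectives = objective[occupied]
```

The same boolean mask can be applied to every field in the returned dict, since all fields share the leading batch dimension.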
- retrieve_single(measures: numpy.typing.ArrayLike) tuple[bool, dict[str, Any]][source]¶
Queries the archive for an elite with the given measures.
While retrieve() takes in a batch of measures, this method takes in the measures for only one solution and returns a single bool and a dict with single entries:
    occupied, elite = archive.retrieve_single(...)
    occupied  # Bool
    elite["solution"]  # Shape: (solution_dim,)
    elite["objective"]  # Shape: (objective_dim,)
    elite["measures"]  # Shape: (measure_dim,)
    ...
- Parameters:¶
- measures: numpy.typing.ArrayLike¶
(measure_dim,) array of measures.
- Returns:¶
2-element tuple of (boolean, dict of data for one elite)
- Raises:¶
ValueError – measures is not of shape (measure_dim,).
ValueError – measures has non-finite values (inf or NaN).
- sample_elites(n: int | integer, replace: bool = True) dict[str, ndarray][source]¶
Randomly samples elites from the archive.
Currently, this sampling is done uniformly at random, either with or without replacement. Additional sampling methods may be supported in the future.
Example
    elites = archive.sample_elites(16)
    elites["solution"]  # Shape: (16, solution_dim)
    elites["objective"]
    elites["measures"]
    ...
- Parameters:¶
- Returns:¶
A batch of elites randomly selected from the archive.
- Raises:¶
IndexError – The archive is empty.
ValueError – n was greater than the number of elites in the archive when replace=False.
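Uniform sampling with or without replacement, together with the two error cases listed above, can be sketched with NumPy's random generator. `sample_indices` is a hypothetical helper that only picks positions among the stored elites; how the archive performs the sampling internally is not shown in this documentation.

```python
import numpy as np


def sample_indices(num_elites, n, replace=True, seed=None):
    """Hypothetical helper: uniformly sample n positions among the elites."""
    if num_elites == 0:
        # Mirrors the documented IndexError for an empty archive.
        raise IndexError("The archive is empty.")
    rng = np.random.default_rng(seed)
    # Generator.choice raises ValueError when n > num_elites and
    # replace=False, matching the documented ValueError.
    return rng.choice(num_elites, size=n, replace=replace)


with_replacement = sample_indices(5, 16, seed=0)  # repeats are expected
without_replacement = sample_indices(5, 5, replace=False, seed=0)
```

With replacement, a batch larger than the archive is fine; without replacement, the batch can be at most as large as the number of elites.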
- property best_elite : dict[str, Any]¶
The elite with the highest objective in the archive.
None if there are no elites in the archive.
- property boundaries : list[ndarray]¶
The dynamic boundaries of the cells in each dimension.
Entry i in this list is an array that contains the boundaries of the cells in dimension i. The array contains self.dims[i] + 1 entries laid out like this:
    Archive cells:  | 0 | 1 | ... | self.dims[i] - 1 |
    boundaries[i]:  0   1   2     self.dims[i] - 1   self.dims[i]
Thus, boundaries[i][j] and boundaries[i][j + 1] are the lower and upper bounds of cell j in dimension i. To access the lower bounds of all the cells in dimension i, use boundaries[i][:-1], and to access all the upper bounds, use boundaries[i][1:].
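As a quick illustration of the slicing described above, the following NumPy snippet uses a made-up boundary array for one dimension with three cells. The values are assumptions chosen to show unequal cell widths, which is what a remap typically produces.

```python
import numpy as np

# Assumed boundaries for one dimension with dims[i] == 3 cells.
boundaries_i = np.array([0.0, 0.1, 0.5, 1.0])

lower = boundaries_i[:-1]  # lower bound of every cell
upper = boundaries_i[1:]   # upper bound of every cell
widths = upper - lower     # cell widths need not be equal after a remap

# Bounds of a single cell j.
j = 1
cell_bounds = (boundaries_i[j], boundaries_i[j + 1])
```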
- property dims : ndarray¶
(measure_dim,) array listing the number of cells in each dimension.
- property interval_size : ndarray¶
(measure_dim,) array listing the size of each dim (upper_bounds - lower_bounds).
- property lower_bounds : ndarray¶
(measure_dim,) array listing the lower bound of each dimension.
- property objective_dim : tuple[()] | int | integer¶
Dimensionality of the objective space.
The empty tuple () indicates a scalar objective.
- property remap_frequency : int | integer¶
Frequency of remapping.
The archive remaps each time remap_frequency solutions have been inserted.
- property solution_dim : int | integer | tuple[int | integer, ...]¶
Dimensionality of the solution space.
- property stats : ArchiveStats¶
Statistics about the archive.
See ArchiveStats for more info.
- property upper_bounds : ndarray¶
(measure_dim,) array listing the upper bound of each dimension.