ribs.archives.ProximityArchive¶
-
class ribs.archives.ProximityArchive(*, solution_dim: int | integer | tuple[int | integer, ...], measure_dim: int | integer, k_neighbors: int | integer, novelty_threshold: float | floating, local_competition: bool =
False, initial_capacity: int | integer =128, qd_score_offset: float | floating =0.0, seed: int | integer | None =None, solution_dtype: numpy.typing.DTypeLike =None, objective_dtype: numpy.typing.DTypeLike =None, measures_dtype: numpy.typing.DTypeLike =None, dtype: numpy.typing.DTypeLike =None, extra_fields: dict[str, tuple[int | integer | tuple[int | integer, ...], DTypeLike]] | None =None, kdtree_kwargs: dict | None =None, ckdtree_kwargs: None =None)[source]¶ An archive that adds new solutions based on novelty.
This archive originates in Novelty Search and is described in Lehman 2011. Solutions are added to the archive if their novelty exceeds a certain threshold. Novelty \(\rho\) is defined as the average (Euclidean) distance in measure space to the \(k\)-nearest neighbors of the solution in the archive:
\[\rho(x) = \frac{1}{k}\sum_{i=1}^{k}\text{dist}(x, \mu_i)\]Where \(x\) is the measure value of some solution, and \(\mu_{1..k}\) are the measure values of the \(k\)-nearest neighbors in measure space.
This archive also supports the local competition computation from Novelty Search with Local Competition, described in Lehman 2011b.
Note
When used for diversity optimization, this archive does not require any objectives, and
objective=Nonecan be passed intoadd(). For consistency with the rest of pyribs,objective=Nonewill result in a default objective value of 0, which will also cause stats like QD score and best objective to be 0. Alternatively, it is possible to associate objectives with the solutions by passingobjectivetoadd()just like in other archives.Note
Some statistics will behave differently than in other archives:
If this archive has any solutions in it, the coverage (
archive.stats.coverage) will always be reported as 1. This is because the archive is unbounded, so there is no predefined number of cells to fill. As such,archive.stats.num_elitesmay provide a more meaningful coverage metric. It is also common to create aGridArchiveorCVTArchiveas a result archive, from which a meaningful coverage can be computed.Since the number of archive cells equals the number of elites in the archive, the normalized QD score (
archive.stats.norm_qd_score) will always equal the mean objective (archive.stats.obj_mean).
By default, this archive stores the following data fields:
solution,objective,measures, andindex. The integerindexuniquely identifies each cell.- Parameters:¶
- solution_dim: int | integer | tuple[int | integer, ...]¶
Dimensionality of the solution space. Scalar or multi-dimensional solution shapes are allowed by passing an empty tuple or tuple of integers, respectively.
- measure_dim: int | integer¶
Dimensionality of the measure space.
- k_neighbors: int | integer¶
The maximum number of nearest neighbors for computing novelty (maximum here is indicated since there may be fewer than
k_neighborssolutions in the archive).- novelty_threshold: float | floating¶
The level of novelty required to add a solution to the archive.
- local_competition: bool =
False¶ Whether to turn on local competition behavior. If turned on, the archive will require objectives to be passed in during
add(). Furthermore, theadd_inforeturned byadd()will include local competition information. Finally, solutions can be replaced in the archive. Specifically, if a candidate solution’s novelty is below the novelty threshold, its objective will be compared to that of its nearest neighbor. If the candidate’s objective is higher, it will replace the nearest neighbor.- initial_capacity: int | integer =
128¶ Since this archive is unstructured, it does not have a fixed size, and it will grow as solutions are added. In the implementation, we store solutions in fixed-size arrays, and every time the capacity of these arrays is reached, we double their sizes (similar to the vector in C++). This parameter determines the initial capacity of the archive’s arrays. It may be useful when it is known in advance how large the archive will grow.
- qd_score_offset: float | floating =
0.0¶ Archives often contain negative objective values, and if the QD score were to be computed with these negative objectives, the algorithm would be penalized for adding new cells with negative objectives. Thus, a standard practice is to normalize all the objectives so that they are non-negative by introducing an offset. This QD score offset will be subtracted from all objectives in the archive, e.g., if your objectives go as low as -300, pass in -300 so that each objective will be transformed as
objective - (-300).- seed: int | integer | None =
None¶ Value to seed the random number generator. Set to None to avoid a fixed seed.
- solution_dtype: numpy.typing.DTypeLike =
None¶ Data type of the solutions. Defaults to float64 (numpy’s default floating point type).
- objective_dtype: numpy.typing.DTypeLike =
None¶ Data type of the objectives. Defaults to float64 (numpy’s default floating point type).
- measures_dtype: numpy.typing.DTypeLike =
None¶ Data type of the measures. Defaults to float64 (numpy’s default floating point type).
- dtype: numpy.typing.DTypeLike =
None¶ Shortcut for providing data type of the solutions, objectives, and measures. Defaults to float64 (numpy’s default floating point type). This parameter sets all the dtypes simultaneously. To set individual dtypes, pass
solution_dtype,objective_dtype, andmeasures_dtype. Note thatdtypecannot be used at the same time as those parameters.- extra_fields: dict[str, tuple[int | integer | tuple[int | integer, ...], DTypeLike]] | None =
None¶ Description of extra fields of data that are stored next to elite data like solutions and objectives. The description is a dict mapping from a field name (str) to a tuple of
(shape, dtype). For instance,{"foo": ((), np.float32), "bar": ((10,), np.float32)}will create a “foo” field that contains scalar values and a “bar” field that contains 10D values. Note that field names must be valid Python identifiers, and names already used in the archive are not allowed.- kdtree_kwargs: dict | None =
None¶ When computing nearest neighbors, we construct a
KDTree. This parameter will pass additional kwargs when constructing the tree. By default, we do not pass in any kwargs.
- Raises:¶
ValueError –
initial_capacitymust be at least 1.
Methods
__iter__()Creates an iterator over the elites in the archive.
__len__()Number of elites in the archive.
add(solution, objective, measures, **fields)Inserts a batch of solutions into the archive.
add_single(solution, objective, measures, ...)Inserts a single solution into the archive.
clear()Removes all elites in the archive.
compute_novelty(measures[, local_competition])Computes the novelty and local competition of the given measures.
data()Returns data of the elites in the archive.
index_of(measures)Returns the index of the closest solution to the given measures.
index_of_single(measures)Returns the index of the measures for one solution.
retrieve(measures)Queries the archive for elites with the given batch of measures.
retrieve_single(measures)Queries the archive for an elite with the given measures.
sample_elites(n[, replace])Randomly samples elites from the archive.
Attributes
The elite with the highest objective in the archive.
Number of solutions that can currently be stored in this archive.
Included for API compatibility; equivalent to
__len__().Mapping from field name to dtype for all fields in the archive.
Whether the archive is empty.
List of data fields in the archive.
The number of nearest neighbors for computing novelty.
Whether local competition behavior is turned on.
Dimensionality of the measure space.
The degree of novelty required add a solution to the archive.
Dimensionality of the objective space.
Subtracted from objective values when computing the QD score.
Dimensionality of the solution space.
Statistics about the archive.
- add(solution: ArrayLike, objective: ArrayLike | None, measures: ArrayLike, **fields: ArrayLike) BatchData[source]¶
Inserts a batch of solutions into the archive.
Solutions are inserted if they have a high enough novelty score as discussed in the documentation for this class. The novelty is determined by comparing to solutions currently in the archive.
If
local_competitionis turned on, solutions can also replace existing solutions in the archive. Namely, if the solution was not novel enough to be added, it will be compared to its nearest neighbor, and if it exceeds the objective value of its nearest neighbor, it will replace the nearest neighbor. If there are conflicts where multiple solutions may replace a single solution, the highest-performing is chosen.Note
The indices of all arguments should “correspond” to each other, i.e.
solution[i],objective[i],measures[i], and should be the solution parameters, objective, and measures for solutioni.- Parameters:¶
- solution: ArrayLike¶
(batch_size,
solution_dim) array of solution parameters.- objective: ArrayLike | None¶
A value of None will cause the objective values to default to 0. However, if the user wishes to associate an objective with each solution, this can be a (batch_size,) array with objective function evaluations of the solutions. If
local_competitionis turned on, this argument must be provided.- measures: ArrayLike¶
(batch_size,
measure_dim) array with measure space coordinates of all the solutions.- **fields: ArrayLike¶
Additional data for each solution. Each argument should be an array with batch_size as the first dimension.
- Returns:¶
Information describing the result of the add operation. The dict contains the following keys:
"status"(numpy.ndarrayofnumpy.int32): An array of integers that represent the “status” obtained when attempting to insert each solution in the batch. Each item has the following possible values:0: The solution was not added to the archive.1: The solution replaced an existing solution in the archive due to having a higher objective (only applies iflocal_competitionis turned on).2: The solution was added to the archive due to being sufficiently novel.
To convert statuses to a more semantic format, cast all statuses to
AddStatuse.g. with[AddStatus(s) for s in add_info["status"]]."novelty"(numpy.ndarrayofdtypes[“measures”]): The computed novelty of the solutions passed in. If there were no solutions to compute novelty with respect to (i.e., the archive was empty), the novelty is set to thenovelty_threshold."local_competition"(numpy.ndarrayofint): Only available iflocal_competitionis turned on. Indicates, for each solution, how many of the nearest neighbors had lower objective values. Maximum value isk_neighbors. If there were no solutions to compute novelty with respect to, (i.e., the archive was empty), the local competition is set to 0."value"(numpy.ndarrayofdtypes[“objective”]): Only available iflocal_competitionis turned on. The meaning of each value depends on the correspondingstatusand is inspired by the values in CMA-ME (Fontaine 2020):0(not added): The value is the “negative improvement,” i.e., the objective of the solution passed in minus the objective of the nearest neighbor (this value is negative because the solution did not have a high enough objective to be added to the archive).1(replace/improve existing solution): The value is the “improvement,” i.e., the objective of the solution passed in minus the objective of the elite that was replaced.2(new solution): The value is just the objective of the solution.
- Raises:¶
ValueError – The array arguments do not match their specified shapes.
ValueError –
objectiveormeasureshas non-finite values (inf or NaN).ValueError –
local_competitionis turned on but objective was not passed in.
- add_single(solution: ArrayLike, objective: ArrayLike | None, measures: ArrayLike, **fields: ArrayLike) SingleData[source]¶
Inserts a single solution into the archive.
- Parameters:¶
- Returns:¶
Information describing the result of the add operation. The dict contains
statusandnoveltykeys; refer toadd()for the meaning of status and novelty.- Raises:¶
ValueError – The array arguments do not match their specified shapes.
ValueError –
objectiveis non-finite (inf or NaN) ormeasureshas non-finite values.ValueError –
local_competitionis turned on but objective was not passed in.
-
compute_novelty(measures: ArrayLike, local_competition: ArrayLike | None =
None) np.ndarray | tuple[np.ndarray, np.ndarray][source]¶ Computes the novelty and local competition of the given measures.
- Parameters:¶
- measures: ArrayLike¶
(batch_size,
measure_dim) array of coordinates in measure space.- local_competition: ArrayLike | None =
None¶ This can be None to indicate not to compute local competition. Otherwise, it can be a (batch_size,) array of objective values to use as references for computing objective values.
- Returns:¶
(batch_size,) array holding the novelty score of each measure. If the archive is empty, the novelty is set to the
novelty_threshold.If
local_competitionis passed in, a (batch_size,) array holding the local competition of each solution will also be returned. If the archive is empty, the local competition will be set to 0.
- Return type:¶
Either one array or a tuple of two arrays
-
data(fields: str, return_type: 'dict' | 'tuple' | 'pandas' =
'dict') ndarray[source]¶ -
data(fields: None | Collection[str] =
None, return_type: 'dict' ='dict') dict[str, ndarray] -
data(fields: None | Collection[str] =
None, return_type: 'tuple' ='tuple') tuple[ndarray] -
data(fields: None | Collection[str] =
None, return_type: 'pandas' ='pandas') ArchiveDataFrame Returns data of the elites in the archive.
- Parameters:¶
- fields: str¶
- fields: None | Collection[str] =
None List of fields to include, such as
"solution","objective","measures", and other fields in the archive. This can also be a single str indicating a field name.- return_type: 'dict' | 'tuple' | 'pandas' =
'dict'¶ - return_type: 'dict' =
'dict' - return_type: 'tuple' =
'tuple' - return_type: 'pandas' =
'pandas' Data to return; see below. Ignored if
fieldsis a str.
- Returns:¶
The data for all elites in the archive. All data returned by this method will be a copy, i.e., the data will not update as the archive changes. If
fieldswas a single str, the returned data will just be an array holding data for the given field, such as:measures = archive.data("measures")Otherwise, the returned data can take the following forms, depending on the
return_typeargument:return_type="dict": Dict mapping from the field name to the field data at the given indices. An example is:{ "solution": [[1.0, 1.0, ...], ...], "objective": [1.5, ...], "measures": [[1.0, 2.0], ...], ... }The keys in this dict can be modified with the
fieldsarg; duplicate fields will be ignored since the dict stores unique keys.return_type="tuple": Tuple of arrays matching the field order infields. For instance, iffieldsis["objective", "measures"], this method would return a tuple of(objective_arr, measures_arr)that could be unpacked as:objective, measures = archive.data(["objective", "measures"], return_type="tuple")Unlike with the
dictreturn type, duplicate fields will show up as duplicate entries in the tuple, e.g.,fields=["objective", "objective"]will result in two objective arrays being returned.When
fields=None(the default case), the fields in the tuple will be ordered according to thefield_list.return_type="pandas": AnArchiveDataFramewith the following columns:For fields that are scalars, a single column with the field name. For example,
objectivewould have a single column calledobjective.For fields that are 1D arrays, multiple columns with the name suffixed by its index. To illustrate, for a
measuresfield of length 10, the dataframe would contain 10 columns with namesmeasures_0,measures_1, …,measures_9. The output format for fields with >1D data is currently not defined.
In short, the dataframe might look like this by default:
solution_0
…
objective
measures_0
…
…
…
Like the other return types, the columns returned can be adjusted with the
fieldsparameter.
- Raises:¶
ValueError – Invalid field name provided.
ValueError – Invalid return_type provided.
ValueError – Passed
return_type="pandas"when one of the fields has >1D data.
- index_of(measures: numpy.typing.ArrayLike) ndarray[source]¶
Returns the index of the closest solution to the given measures.
Unlike the structured archives like
GridArchive, this archive does not have indexed cells where each measure “belongs.” Thus, this method instead returns the index of the solution with the closest measure to each solution passed in.This means that
retrieve()will return the solution with the closest measure to each measure passed into that method.- Parameters:¶
- measures: numpy.typing.ArrayLike¶
(batch_size,
measure_dim) array of coordinates in measure space.
- Returns:¶
(batch_size,) array of integer indices representing the location of the solution in the archive.
- Raises:¶
RuntimeError – There were no entries in the archive.
ValueError –
measuresis not of shape (batch_size,measure_dim).ValueError –
measureshas non-finite values (inf or NaN).
- index_of_single(measures: numpy.typing.ArrayLike) int | integer[source]¶
Returns the index of the measures for one solution.
See
index_of().- Parameters:¶
- measures: numpy.typing.ArrayLike¶
(
measure_dim,) array of measures for a single solution.
- Returns:¶
Integer index of the measures in the archive’s storage arrays.
- Raises:¶
ValueError –
measuresis not of shape (measure_dim,).ValueError –
measureshas non-finite values (inf or NaN).
- retrieve(measures: numpy.typing.ArrayLike) tuple[ndarray, dict[str, ndarray]][source]¶
Queries the archive for elites with the given batch of measures.
This method operates in batch. It takes in a batch of measures and outputs the batched data for the elites:
occupied, elites = archive.retrieve(...) occupied # Shape: (batch_size,) elites["solution"] # Shape: (batch_size, solution_dim) elites["objective"] # Shape: (batch_size, objective_dim) elites["measures"] # Shape: (batch_size, measure_dim) ...occupiedindicates whether an elite was found for each measure, i.e., whether the archive was occupied at each queried measure. Ifoccupied[i]is True, thenelites["solution"][i],elites["objective"][i],elites["measures"][i], and other fields will contain the data of the elite for the inputmeasures[i]. Ifoccupied[i]is False, then those fields will instead have arbitrary values, e.g.,elites["solution"][i]may be set to all NaN.- Parameters:¶
- measures: numpy.typing.ArrayLike¶
(batch_size,
measure_dim) array of measure space points at which to retrieve solutions.
- Returns:¶
2-element tuple of (boolean
occupiedarray, dict of elite data). See above for description.- Raises:¶
ValueError –
measuresis not of shape (batch_size,measure_dim).ValueError –
measureshas non-finite values (inf or NaN).
- retrieve_single(measures: numpy.typing.ArrayLike) tuple[bool, dict[str, Any]][source]¶
Queries the archive for an elite with the given measures.
While
retrieve()takes in a batch of measures, this method takes in the measures for only one solution and returns a single bool and a dict with single entries:occupied, elite = archive.retrieve_single(...) occupied # Bool elite["solution"] # Shape: (solution_dim,) elite["objective"] # Shape: (objective_dim,) elite["measures"] # Shape: (measure_dim,) ...- Parameters:¶
- measures: numpy.typing.ArrayLike¶
(
measure_dim,) array of measures.
- Returns:¶
2-element tuple of (boolean, dict of data for one elite)
- Raises:¶
ValueError –
measuresis not of shape (measure_dim,).ValueError –
measureshas non-finite values (inf or NaN).
-
sample_elites(n: int | integer, replace: bool =
True) dict[str, ndarray][source]¶ Randomly samples elites from the archive.
Currently, this sampling is done uniformly at random, either with or without replacement. Additional sampling methods may be supported in the future.
Example
elites = archive.sample_elites(16) elites["solution"] # Shape: (16, solution_dim) elites["objective"] elites["measures"] ...- Parameters:¶
- Returns:¶
A batch of elites randomly selected from the archive.
- Raises:¶
IndexError – The archive is empty.
ValueError –
nwas greater than the number of elites in the archive whenreplace=False.
- property best_elite : dict[str, Any]¶
The elite with the highest objective in the archive.
None if there are no elites in the archive.
- property capacity : int¶
Number of solutions that can currently be stored in this archive.
The capacity doubles every time the archive fills up.
- property cells : int¶
Included for API compatibility; equivalent to
__len__().Strictly speaking, this archive does not have “cells” since it does not have a tessellation like other archives. However, for API compatibility, we set the number of cells as equal to the number of solutions currently in the archive.
- property objective_dim : tuple[()] | int | integer¶
Dimensionality of the objective space.
The empty tuple
()indicates a scalar objective.
- property solution_dim : int | integer | tuple[int | integer, ...]¶
Dimensionality of the solution space.
- property stats : ArchiveStats¶
Statistics about the archive.
See
ArchiveStatsfor more info.