ribs.archives.DensityArchive

class ribs.archives.DensityArchive(*, measure_dim: int | integer, buffer_size: int | integer = 10000, density_method: 'kde' | 'kde_sklearn' | 'cnf' = 'kde', bandwidth: float | floating | None = None, sklearn_kwargs: dict | None = None, cnf_kwargs: dict | None = None, cnf_lr: float | floating = 0.001, cnf_train_steps: int | integer = 5, cnf_batch_size: int | integer = 32, cnf_min_buffer_size: int | integer = 128, cnf_device: device = 'cpu', seed: int | integer | None = None, measures_dtype: numpy.typing.DTypeLike = None, dtype: numpy.typing.DTypeLike = None)

An archive that models the density of solutions in measure space.

This archive originates from Density Descent Search (DDS), introduced in Lee 2024. It maintains a buffer of measures and uses that buffer to build a density estimator such as a KDE. The density estimator indicates how densely each region of measure space is populated with solutions; to improve exploration, an algorithm should target regions with a low density of solutions.
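To make the idea concrete, the following is a minimal standard-library sketch of a Gaussian KDE over a buffer of measures. It is an illustration only, not this class's implementation: the function name gaussian_kde_density, the isotropic Gaussian kernel, the normalization, and the toy buffer are all assumptions made for the example.

```python
import math


def gaussian_kde_density(queries, buffer, bandwidth):
    """Evaluates an isotropic Gaussian KDE built from `buffer` at each query.

    Hypothetical helper for illustration; the library's kernel and
    normalization may differ.
    """
    dim = len(buffer[0])
    # Normalizer so that the estimate integrates to 1 over measure space.
    norm = len(buffer) * (bandwidth * math.sqrt(2.0 * math.pi)) ** dim
    densities = []
    for q in queries:
        kernel_sum = sum(
            math.exp(-0.5 * math.dist(q, b) ** 2 / bandwidth**2) for b in buffer
        )
        densities.append(kernel_sum / norm)
    return densities


# Toy buffer: three measures clustered near the origin and one far away.
buffer = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (3.0, 3.0)]
candidates = [(0.05, 0.05), (1.5, 1.5), (3.0, 3.0)]
densities = gaussian_kde_density(candidates, buffer, bandwidth=0.5)

# The candidate in the least crowded region is the one exploration would favor.
least_dense = candidates[densities.index(min(densities))]
```

In this toy setup, the candidate that sits between the cluster and the lone point receives the lowest density, so it is the one a density-descent style algorithm would prefer.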

Incoming solutions are added to the buffer with reservoir sampling, specifically as described in Li 1994. Reservoir sampling enables sampling uniformly from the incoming stream of solutions generated by the emitters.
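As a rough illustration of reservoir sampling, the sketch below uses the classic "Algorithm R", which is simpler than the variant described in Li 1994 but yields the same uniform-sample guarantee. The helper reservoir_add and the toy stream are hypothetical and are not taken from this class.

```python
import random


def reservoir_add(buffer, item, n_seen, capacity, rng):
    """Offers one item to the reservoir and returns the updated stream count.

    Classic Algorithm R: after n items, each one is in the buffer with
    probability capacity / n.
    """
    n_seen += 1
    if len(buffer) < capacity:
        # The buffer is not full yet, so every item is kept.
        buffer.append(item)
    else:
        # Replace a random slot with probability capacity / n_seen.
        j = rng.randrange(n_seen)
        if j < capacity:
            buffer[j] = item
    return n_seen


rng = random.Random(0)
buffer, n_seen = [], 0
for i in range(1000):
    # Each stream item stands in for one solution's measures.
    n_seen = reservoir_add(buffer, (float(i), float(-i)), n_seen, capacity=10, rng=rng)
```

The buffer never grows beyond its capacity, yet every item in the stream has had the same chance of ending up in it, which is what keeps the density estimate representative of all solutions generated so far.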

Unlike other archives, this archive does not store any elites, and as such, most methods from ArchiveBase are not implemented. Rather, it is assumed that a separate result_archive (see Scheduler) will store solutions when using this archive.

Note

When density_method="cnf", this class requires PyTorch and Zuko to be installed, e.g., by running pip install torch "zuko>=1.0.0" (the quotes keep the shell from treating > as a redirect).

Note

For DDS-CNF, the default cnf parameters, particularly cnf_kwargs, cnf_train_steps, and cnf_batch_size, are tuned to provide reasonable performance on the Sphere example (Sphere Function with Various Algorithms). Further tuning may be needed for new domains. Additionally, these parameters are intended to reduce the computational cost of DDS-CNF when compared to the original parameters in Lee 2024. See #707 for more info.

Parameters:
measure_dim: int | integer

Dimension of the measure space.

buffer_size: int | integer = 10000

Size of the buffer of measures.

density_method: 'kde' | 'kde_sklearn' | 'cnf' = 'kde'

Method for computing density. Supports "kde" (KDE – kernel density estimator), "kde_sklearn" (KDE using sklearn.neighbors.KernelDensity), and "cnf" (continuous normalizing flow, i.e., DDS-CNF from Lee 2024). When "kde_sklearn" is used, this archive computes log density rather than density; see sklearn.neighbors.KernelDensity.score_samples(). When "cnf" is used, this archive also returns log density since that is what the flow models directly. "cnf" requires installing torch and zuko.

bandwidth: float | floating | None = None

Bandwidth when using kde or kde_sklearn as the density_method.

sklearn_kwargs: dict | None = None

Keyword arguments for sklearn.neighbors.KernelDensity when using "kde_sklearn" as the density_method. Note that bandwidth is already passed in via the bandwidth parameter above.

cnf_kwargs: dict | None = None

Additional keyword arguments forwarded to zuko.flows.continuous.CNF when density_method="cnf". features is set automatically from measure_dim and cannot be overridden. Defaults to {"hidden_features": (64, 64)}.

cnf_lr: float | floating = 0.001

Adam learning rate used to fine-tune the CNF during each call to add() when density_method="cnf".

cnf_train_steps: int | integer = 5

Number of Adam steps taken every time the CNF is fine-tuned on the buffer.

cnf_batch_size: int | integer = 32

Mini-batch size used when fine-tuning the CNF. If the buffer has fewer points, the entire buffer is used.

cnf_min_buffer_size: int | integer = 128

Minimum number of points in the buffer before the CNF is trained. Before this threshold, the flow stays untrained and density queries return zeros.
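The interaction of cnf_train_steps, cnf_batch_size, and cnf_min_buffer_size can be sketched in plain Python as below. This is only a reading of the parameter descriptions above, not the library's code: the helper names are hypothetical, and both the inclusive threshold comparison and sampling without replacement are assumptions.

```python
import random


def sample_minibatch(buffer, cnf_batch_size, rng):
    """Returns one mini-batch; the whole buffer if it is too small."""
    if len(buffer) <= cnf_batch_size:
        return list(buffer)
    # Assumption: mini-batches are drawn without replacement.
    return rng.sample(buffer, cnf_batch_size)


def training_batches(buffer, cnf_train_steps, cnf_batch_size,
                     cnf_min_buffer_size, rng):
    """Lists the mini-batches one fine-tuning round would consume."""
    # Assumption: training starts once the buffer reaches the minimum size.
    if len(buffer) < cnf_min_buffer_size:
        return []  # The flow stays untrained for now.
    return [
        sample_minibatch(buffer, cnf_batch_size, rng)
        for _ in range(cnf_train_steps)
    ]


rng = random.Random(0)
small_buffer = [(float(i),) for i in range(100)]
large_buffer = [(float(i),) for i in range(200)]

# Below the default threshold of 128 points, no training step is scheduled.
skipped = training_batches(small_buffer, 5, 32, 128, rng)
# Above it, each round takes 5 steps on mini-batches of 32 points.
scheduled = training_batches(large_buffer, 5, 32, 128, rng)
# With a low threshold and a tiny buffer, each step sees the entire buffer.
tiny = training_batches(small_buffer[:10], 5, 32, 4, rng)
```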

cnf_device: device = 'cpu'

Torch device on which the CNF lives when density_method="cnf".

seed: int | integer | None = None

Value to seed the random number generator. Set to None to avoid a fixed seed.

measures_dtype: numpy.typing.DTypeLike = None

Data type of the measures. Defaults to float64 (numpy’s default floating point type).

dtype: numpy.typing.DTypeLike = None

Alternative way to provide the data type of the measures. Included for API compatibility. Cannot be used at the same time as measures_dtype.

Raises:
  • ValueError – Unknown density_method provided.

  • ImportError – density_method="cnf" is requested but torch or zuko is not installed.

Methods

add(solution, objective, measures, **fields)

Adds measures to the buffer and updates the density estimator if necessary.

compute_density(measures)

Computes density at the given points in measure space.

Attributes

buffer

Buffer of measures considered in the density estimator.

empty

Whether the archive is empty; always False.

measure_dim

Dimensionality of the measure space.

objective_dim

Dimensionality of the objective space.

solution_dim

Dimensionality of the solution space.

add(solution: ArrayLike | None, objective: ArrayLike | None, measures: ArrayLike, **fields: ArrayLike | None) -> BatchData

Adds measures to the buffer and updates the density estimator if necessary.

The measures are added to the buffer with reservoir sampling to enable sampling uniformly from the incoming solutions.

Parameters:
solution: ArrayLike | None

Included for API consistency. Any value is ignored.

objective: ArrayLike | None

Included for API consistency. Any value is ignored.

measures: ArrayLike

(batch_size, measure_dim) array with measure space coordinates of all the solutions.

**fields: ArrayLike | None

Included for API consistency. Any value is ignored.

Returns:

Information describing the result of the add operation. The dict contains the following keys:

  • "status" (numpy.ndarray of np.int32): An array of integers that represent the “status” obtained when attempting to insert each solution in the batch. Since this archive does not store any elites, all statuses are set to 2 (which normally indicates the solution discovered a new cell in the archive – see AddStatus).

  • "density" (numpy.ndarray of the dtype passed in at init): The density values of the measures passed in, computed before the buffer or density estimator was updated. Note that when "kde_sklearn" or "cnf" is used as the density_method, log density is computed rather than density. For more info, see sklearn.neighbors.KernelDensity.score_samples() for "kde_sklearn" and the class-level docstring for "cnf".
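The contract described by the two keys above can be mimicked with a small standard-library stand-in. Everything in this sketch is hypothetical (ToyAddStatus, toy_add, and the crude neighbor-count "density"); it only illustrates that density is evaluated against the buffer as it was before the batch is added, and that every status equals 2.

```python
import math
from enum import IntEnum


class ToyAddStatus(IntEnum):
    """Stand-in for AddStatus, in which the value 2 means a new cell."""
    NOT_ADDED = 0
    IMPROVE_EXISTING = 1
    NEW = 2


def count_within_radius(point, buffer, radius=1.0):
    """Crude density proxy: the number of buffered measures near `point`."""
    return float(sum(1 for b in buffer if math.dist(point, b) <= radius))


def toy_add(buffer, measures, density_fn):
    """Mimics the shape of the add() return value for illustration."""
    # Density is computed first, against the buffer before this batch.
    density = [density_fn(m, buffer) for m in measures]
    # Simplification: plain append instead of reservoir sampling.
    buffer.extend(measures)
    status = [int(ToyAddStatus.NEW)] * len(measures)
    return {"status": status, "density": density}


buffer = [(0.0, 0.0)]
add_info = toy_add(buffer, [(0.1, 0.0), (5.0, 5.0)], count_within_radius)
```

Here the first measure has one buffered neighbor and the second has none, even though both end up in the buffer afterwards.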

Raises:
  • ValueError – The array arguments do not match their specified shapes.

  • ValueError – measures has non-finite values (inf or NaN).

compute_density(measures: numpy.typing.ArrayLike) -> ndarray

Computes density at the given points in measure space.

Parameters:
measures: numpy.typing.ArrayLike

(batch_size, measure_dim) array with measure space coordinates of all the solutions.

Returns:

(batch_size,) array of density values of the input solutions.

property buffer : ndarray

Buffer of measures considered in the density estimator.

Shape (n, measure_dim).

property empty : bool

Whether the archive is empty; always False.

Since the archive does not store elites, we always mark it as not empty.

property measure_dim : int | integer

Dimensionality of the measure space.

property objective_dim : tuple[()] | int | integer

Dimensionality of the objective space.

The empty tuple () indicates a scalar objective.

property solution_dim : int | integer | tuple[int | integer, ...]

Dimensionality of the solution space.