# What's New in v0.7.0 The updates in v0.7.0 centered around making the archives more flexible and adding new algorithmic features. Below we describe some of the key changes. For the full list of changes, please refer to our [History page](./history). ## More Flexible Archives We refactored our archives to build on a data structure we call an {class}`~ribs.archives.ArrayStore`. An ArrayStore is essentially a dict mapping from names ("fields") to fixed-size arrays. Archives store data like solutions, objectives, and measures as fields in the ArrayStore. Building on ArrayStore enabled us to create a more flexible API, which also meant introducing several **breaking changes.** Below we list all the updates to the archives, ordered by how likely they are to affect users. ### as_pandas() is deprecated in favor of data() `archive.as_pandas()` has now been deprecated in favor of calling `archive.data()`, which is a much more flexible method. Below are several examples of the {meth}`~ribs.archives.ArchiveBase.data` method: ```python # Returns a dict with all fields in the archive, e.g., # # { # "solution": [[1.0, 1.0, ...], ...], # "objective": [1.5, ...], # "measures": [[1.0, 2.0], ...], # "threshold": [0.8, ...], # "index": [4, ...], # } archive.data() # Returns a single array -- in this case, the shape will be (num elites,). # We think this will be the most useful variant of data(). objective = archive.data("objective") # Returns a dict with just the listed fields, e.g., # # { # "objective": [1.5, ...], # "measures": [[1.0, 2.0], ...], # } archive.data(["objective", "measures"]) # Returns a tuple with just the listed fields, e.g., # # ( # [1.5, ...], # [[1.0, 2.0], ...], # ) archive.data(["objective", "measures"], return_type="tuple") # Returns an ArchiveDataFrame -- see below for several differences from the # as_pandas ArchiveDataFrame. archive.data(return_type="pandas") ``` In general, we believe users will find the single-field version (e.g., `archive.data("objective")` the most useful, with `archive.data(return_type="pandas")` serving as a close replacement for `as_pandas`. However, we note several differences in the ArchiveDataFrame returned by `data()`: 1. Columns previously named `measure_X` are now named `measures_X` for consistency with other fields. 1. The columns are in a different order from before. 1. Iterating over an ArchiveDataFrame now returns a dict rather than the previous `Elite` namedtuple. 1. ArchiveDataFrame no longer has batch() methods. Instead, it has a get_field() method that converts columns back into their arrays, e.g., `df.get_field("objective")`. ### Metadata has been removed in favor of custom archive fields Previously, archives stored `metadata`, which were arbitrary objects associated with each solution or elite. In pyribs 0.7.0, we have removed metadata and instead support custom fields in archives. The example below shows how to use custom fields -- pay attention to the `extra_fields` in the archive definition, and the kwargs in `scheduler.tell()`. ```python import numpy as np from ribs.archives import GridArchive from ribs.emitters import EvolutionStrategyEmitter from ribs.schedulers import Scheduler archive = GridArchive( solution_dim=10, dims=[20, 20], ranges=[(-1, 1), (-1, 1)], # `extra_fields` is a dict mapping from "name" to a tuple of (shape, dtype). # Thus, extra_scalar is a scalar field of type float32, while extra_vector # is a length 10 vector field of type int32. This also works for other # archives. extra_fields={ "extra_scalar": ((), np.float32), "extra_vector": ((10,), np.int32), }, ) # Emitter and scheduler definition -- feel free to skip over. emitters = [ EvolutionStrategyEmitter( archive, x0=[0.0] * 10, sigma0=0.1, ) for _ in range(3) ] scheduler = Scheduler(archive, emitters) solutions = scheduler.ask() # The extra_fields become important in scheduler.tell(), when they must be # passed in along with the usual objectives and measures. This also works for # tell_dqd() in the case of DQD algorithms. scheduler.tell( # The objective is the negative Sphere function, while the measures are the # first two coordinates of the 10D solution. Note that keyword arguments are # optional here (i.e., objective= and measures=). -np.sum(np.square(solutions), axis=1), solutions[:, :2], # The extra_fields specified in the archive must be passed in as kwargs. extra_scalar=solutions[:, 0], extra_vector=np.zeros((len(solutions), 10), dtype=np.int32), ) ``` Notably, it is possible to recover the original metadata behavior by defining a `metadata` field as follows: ```python archive = GridArchive( solution_dim=10, dims=[20, 20], ranges=[(-1, 1), (-1, 1)], extra_fields={ "metadata": ((), object), }, ) ``` ### Additional Changes #### retrieve() no longer returns EliteBatch {meth}`~ribs.archives.ArchiveBase.retrieve` now returns a tuple of two objects: (1) an `occupied` array indicating whether the given cells were occupied, and (2) a dict containing the data of the elites in the given cells. Entries in the dict are only valid if their corresponding cell was occupied. More info: {pr}`414`. #### Parameter names no longer include \_batch Parameters for methods like {meth}`~ribs.archives.ArchiveBase.add` and {meth}`~ribs.schedulers.Scheduler.tell` have been renamed to remove the `_batch` suffix, as it is usually clear that we take in batch arguments. Methods that require single arguments are already named with the `_single` suffix, e.g., `add_single` and `retrieve_single`. #### Thresholds now included in elite data The archive threshold is now included in {attr}`~ribs.archives.ArchiveBase.best_elite` ({pr}`409`) and in data returned by {meth}`~ribs.archives.ArchiveBase.retrieve` ({pr}`414`). #### Elite and EliteBatch namedtuples are deprecated The Elite and EliteBatch namedtuples have been removed, and methods will now return dicts instead). This allows us to support custom field names. In particular, iteration over an archive will now yield a dict instead of the Elite namedtuple. More info: {pr}`397` #### add() methods now return add_info dict Instead of returning separate status and value arrays, the archive {meth}`~ribs.archives.ArchiveBase.add` method now returns a dict that we refer to as `add_info`. The `add_info` contains keys for `status` and `value` and may contain further info in the future. Correspondingly, emitter methods like {meth}`~ribs.emitters.EmitterBase.tell` now take in `add_info` instead of separate `status_batch` and `value_batch` arguments. More info: {pr}`430` ## New Algorithmic Features ### Using pycma in Emitters We added the {class}`~ribs.emitters.opt.PyCMAEvolutionStrategy` to support using [pycma](https://github.com/CMA-ES/pycma) in emitters like the {class}`~ribs.emitters.EvolutionStrategyEmitter`. The ES may be used by passing `es="pycma_es"` to such emitters. Before using this, make sure that pycma is installed, either by running `pip install cma` or `pip install ribs[pycma]`. ### New centroid generation methods in CVTArchive Drawing from [Mouret 2023](https://dl.acm.org/doi/10.1145/3583133.3590726), we now support alternative methods for generating centroids in {class}`~ribs.archives.CVTArchive`. These methods may be specified via the `centroid_method` parameter, for example: ```python from ribs.archives import CVTArchive archive = CVTArchive( solution_dim=10, cells=100, ranges=[(0.1, 0.5), (-0.6, -0.2)], # Alternatives: "kmeans" (default), "sobol", "scrambled_sobol", "halton" centroid_method="random", ) ``` ### OMG-MEGA and OG-MAP-Elites We have added the {class}`~ribs.emitters.GradientOperatorEmitter` to support the OMG-MEGA and OG-MAP-Elites baseline algorithms from [Fontaine 2021](https://arxiv.org/abs/2106.03894). The emitter may be used as follows: ```python from ribs.emitters import GradientOperatorEmitter # For OMG-MEGA GradientOperatorEmitter( sigma=0.0, sigma_g=10.0, measure_gradients=True, normalize_grad=True, ) # For OG-MAP-Elites GradientOperatorEmitter( sigma=0.5, sigma_g=0.5, measure_gradients=False, normalize_grad=False, ) ```