ribs.archives.ArrayStore¶
-
class ribs.archives.ArrayStore(field_desc: dict[str, tuple[int | integer | tuple[int | integer, ...], DTypeLike]], capacity: int | integer, xp: ModuleType | None = None, device: str | int | device | Device = None)[source]¶
Maintains a set of arrays that share a common dimension.
The ArrayStore consists of several fields of data that are manipulated simultaneously via batch operations. Each field is an array with a dimension of (capacity, ...) and can be of any type.
Since the arrays all share a common first dimension, they also share a common index. For instance, if we retrieve() the data at indices [0, 2, 1], we would get a dict that contains the objective and measures at indices 0, 2, and 1, e.g.:
    {
        "objective": [-1, 3, -5],
        "measures": [[0, 0], [2, 1], [3, 5]],
    }
The ArrayStore supports several further operations, such as an add() method that inserts data into the ArrayStore.
By default, the arrays in the ArrayStore are NumPy arrays. However, through support for the Python array API standard, it is possible to use arrays from other libraries like PyTorch by passing in arguments for xp and device.
- Parameters:¶
- field_desc: dict[str, tuple[int | integer | tuple[int | integer, ...], DTypeLike]]¶
Description of fields in the array store. The description is a dict mapping from a str to a tuple of (shape, dtype). For instance, {"objective": ((), np.float32), "measures": ((10,), np.float32)} will create an “objective” field with shape (capacity,) and a “measures” field with shape (capacity, 10). Note that field names must be valid Python identifiers.
- capacity: int | integer¶
Total possible entries in the store.
- xp: ModuleType | None = None¶
Optional array namespace. Should be compatible with the array API standard, or supported by array-api-compat. Defaults to numpy.
- device: str | int | device | Device = None¶
Device for arrays.
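The shape rule for field_desc can be illustrated without the library. The sketch below is a pure-Python illustration, not ribs code: `resolve_shapes` is a hypothetical helper, and dtypes are written as plain strings rather than NumPy dtypes so that the example has no dependencies. It prepends the capacity to each field shape and treats an int shape as a 1D shape of that length.

```python
def resolve_shapes(field_desc, capacity):
    """Map each field name to its full array shape, (capacity, *field_shape).

    Hypothetical helper for illustration only; not part of ribs.
    """
    shapes = {}
    for name, (shape, _dtype) in field_desc.items():
        # An int such as 10 is shorthand for the 1D shape (10,).
        field_shape = (shape,) if isinstance(shape, int) else tuple(shape)
        shapes[name] = (capacity, *field_shape)
    return shapes


field_desc = {
    "objective": ((), "float32"),     # scalar per entry
    "measures": ((10,), "float32"),   # 10 values per entry
}
shapes = resolve_shapes(field_desc, capacity=100)
print(shapes)
```

Every resulting shape starts with the same capacity, which is what lets all fields share a single index.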
- Variables:¶
- _props : dict
Properties that are common to every ArrayStore.
“capacity”: Maximum number of data entries in the store.
“occupied”: Boolean array of size (capacity,) indicating whether each index has data associated with it.
“n_occupied”: Number of data entries currently in the store.
“occupied_list”: Array of size (capacity,) listing all occupied indices in the store. Only the first n_occupied elements will be valid.
“updates”: Int list recording number of calls to functions that modified the store.
- _fields : dict
Holds all the arrays with their data.
- Raises:¶
ValueError – One of the fields in field_desc has a reserved name (currently, “index” is the only reserved name).
ValueError – One of the fields in field_desc has a name that is not a valid Python identifier.
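The two name checks described under Raises can be sketched as follows. This is an assumption about how such validation might look, not the library's actual code; `check_field_names` and `RESERVED_NAMES` are hypothetical names, and the only library-independent call used is the built-in `str.isidentifier()`.

```python
# Hypothetical constant: "index" is the only reserved name per the docs above.
RESERVED_NAMES = {"index"}


def check_field_names(field_desc):
    """Raise ValueError for reserved or non-identifier field names."""
    for name in field_desc:
        if name in RESERVED_NAMES:
            raise ValueError(f"`{name}` is a reserved field name.")
        if not name.isidentifier():
            raise ValueError(f"`{name}` is not a valid Python identifier.")


check_field_names({"objective": ((), "float32")})  # passes silently

# Collect the error messages for two invalid descriptions.
errors = []
for bad in ({"index": ((), "int32")}, {"my field": ((), "float32")}):
    try:
        check_field_names(bad)
    except ValueError as e:
        errors.append(str(e))
print(errors)
```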
Methods
__iter__(): Iterates over entries in the store.
__len__(): Number of occupied indices in the store.
add(indices, data): Adds new data to the store at the given indices.
clear(): Removes all entries from the store.
data(): Retrieves data for all entries in the store.
resize(capacity): Resizes the store to the given capacity.
retrieve(): Collects data at the given indices.
Attributes
XP_NAME
capacity: Maximum number of data entries in the store.
dtypes: Data types of fields in the store.
dtypes_with_index: Data types of fields in the store, plus the index.
field_desc: Description of fields in the store.
field_list: List of fields in the store.
field_list_with_index: List of fields in the store, plus the index.
occupied: (capacity,) Boolean array indicating whether each index has an entry.
occupied_list: int32 array listing all occupied indices in the store.
- __iter__() Iterator[dict[str, Any]][source]¶
Iterates over entries in the store.
When iterated over, this iterator yields dicts mapping from the fields to the individual entries. For instance, if we had an “objective” field, one entry might look like {"index": 1, "objective": 6.0} (similar to retrieve(), the index is included in the output).
Example
    for entry in store:
        entry["index"]
        entry["objective"]
        ...
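To make the shape of each yielded entry concrete, here is a toy generator over plain Python lists. It is a sketch under assumptions: `iter_entries` is a hypothetical function, and visiting entries in occupied_list order is an assumption rather than documented behavior.

```python
def iter_entries(fields, occupied_list, n_occupied):
    """Yield one dict per occupied index, with the index included."""
    # Only the first n_occupied elements of occupied_list are valid.
    for i in occupied_list[:n_occupied]:
        entry = {name: values[i] for name, values in fields.items()}
        entry["index"] = i
        yield entry


fields = {"objective": [0.0, 6.0, 0.0, 2.5]}  # toy store with capacity 4
occupied_list = [1, 3, 0, 0]                  # trailing zeros are padding
entries = list(iter_entries(fields, occupied_list, n_occupied=2))
for entry in entries:
    print(entry["index"], entry["objective"])
```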
- __len__() int[source]¶
Number of occupied indices in the store.
That is, the number of indices that have a corresponding data entry.
- add(indices: numpy.typing.ArrayLike, data: dict[str, numpy.typing.ArrayLike]) None[source]¶
Adds new data to the store at the given indices.
Example
    indices = [4, 7, 8]
    data = {"objective": [1.0, 2.0, 3.0]}
    store.add(indices, data)
    ...
    # Now, index 4 will have `objective` of 1.0, index 7 will have
    # `objective` of 2.0, and index 8 will have `objective` of 3.0.
- Parameters:¶
- Raises:¶
ValueError – data does not have the same keys as the fields of this store.
ValueError – data has fields that have a different length than indices.
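The checks and bookkeeping that add() implies can be sketched with a toy class built on Python lists. `ToyStore` is hypothetical and only mirrors the documented behavior (key check, length check, and the occupied, occupied_list, and n_occupied properties); the real ArrayStore operates on arrays and may differ in its details.

```python
class ToyStore:
    """Toy stand-in for an array store; not the ribs implementation."""

    def __init__(self, field_names, capacity):
        self.fields = {name: [None] * capacity for name in field_names}
        self.occupied = [False] * capacity
        self.occupied_list = [0] * capacity  # first n_occupied entries valid
        self.n_occupied = 0

    def add(self, indices, data):
        # The keys of `data` must match the fields of the store.
        if set(data) != set(self.fields):
            raise ValueError("data must have the same keys as the fields")
        # Every field must supply one value per index.
        for name, values in data.items():
            if len(values) != len(indices):
                raise ValueError(f"{name!r} has a different length than indices")
        for pos, idx in enumerate(indices):
            if not self.occupied[idx]:
                # Record newly occupied indices exactly once.
                self.occupied[idx] = True
                self.occupied_list[self.n_occupied] = idx
                self.n_occupied += 1
            for name, values in data.items():
                self.fields[name][idx] = values[pos]


store = ToyStore(["objective"], capacity=10)
store.add([4, 7, 8], {"objective": [1.0, 2.0, 3.0]})
print(store.n_occupied, store.occupied_list[: store.n_occupied])
```

Adding to an index that is already occupied overwrites its data in this sketch without changing n_occupied.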
- data(fields: str, return_type: 'dict' | 'tuple' | 'pandas' = 'dict') ndarray | Tensor | ndarray[source]¶
- data(fields: None | Collection[str] = None, return_type: 'dict' = 'dict') dict[str, ndarray | Tensor | ndarray]
- data(fields: None | Collection[str] = None, return_type: 'tuple' = 'tuple') tuple[ndarray | Tensor | ndarray]
- data(fields: None | Collection[str] = None, return_type: 'pandas' = 'pandas') ArchiveDataFrame
Retrieves data for all entries in the store.
Equivalent to calling retrieve() with indices set to occupied_list.
- Parameters:¶
- fields: str¶
- fields: None | Collection[str] = None
See retrieve().
- return_type: 'dict' | 'tuple' | 'pandas' = 'dict'¶
- return_type: 'dict' = 'dict'
- return_type: 'tuple' = 'tuple'
- return_type: 'pandas' = 'pandas'
See retrieve().
- Returns:¶
See data in retrieve(). occupied is not returned since all indices are known to be occupied in this method.
- retrieve(indices: ArrayLike, fields: str, return_type: 'dict' | 'tuple' | 'pandas' = 'dict') ndarray | Tensor | ndarray[source]¶
- retrieve(indices: ArrayLike, fields: None | Collection[str] = None, return_type: 'dict' = 'dict') dict[str, ndarray | Tensor | ndarray]
- retrieve(indices: ArrayLike, fields: None | Collection[str] = None, return_type: 'tuple' = 'tuple') tuple[ndarray | Tensor | ndarray]
- retrieve(indices: ArrayLike, fields: None | Collection[str] = None, return_type: 'pandas' = 'pandas') ArchiveDataFrame
Collects data at the given indices.
- Parameters:¶
- indices: ArrayLike¶
List of indices at which to collect data.
- fields: str¶
- fields: None | Collection[str] = None
List of fields to include. By default, all fields will be included, with an additional “index” as the last field. The “index” field can also be added anywhere in this list of fields. This argument can also be a single str indicating a field name.
- return_type: 'dict' | 'tuple' | 'pandas' = 'dict'¶
- return_type: 'dict' = 'dict'
- return_type: 'tuple' = 'tuple'
- return_type: 'pandas' = 'pandas'
Type of data to return. See the data returned below. Ignored if fields is a str.
- Returns:¶
2-element tuple.
The first element is occupied, an array indicating which indices, among those passed in, have an associated data entry. For instance, if indices is [0, 1, 2] and only index 2 has data, then occupied will be [False, False, True]. Note that if a given index is not marked as occupied, it can have any data value associated with it. For instance, if index 1 was not occupied, then the 6.0 returned in the dict example below should be ignored.
The second element is data, the data at the given indices. If fields was a single str, this will just be an array holding data for the given field. Otherwise, this data can take the following forms, depending on the return_type argument:
- return_type="dict": Dict mapping from the field name to the field data at the given indices. For instance, if we have an objective field and request data at indices [4, 1, 0], we would get data that looks like {"objective": [1.5, 6.0, 2.3], "index": [4, 1, 0]}. Observe that we also return the indices as an index entry in the dict. The keys in this dict can be modified using the fields arg; duplicate keys will be ignored since the dict stores unique keys.
- return_type="tuple": Tuple of arrays matching the order given in fields. For instance, if fields was ["objective", "measures"], we would receive a tuple of (objective_arr, measures_arr). In this case, the results from retrieve could be unpacked as:
      occupied, (objective, measures) = store.retrieve(
          ...,
          return_type="tuple",
      )
  Unlike with the dict return type, duplicate fields will show up as duplicate entries in the tuple, e.g., fields=["objective", "objective"] will result in two objective arrays being returned.
  By default (i.e., when fields=None), the fields in the tuple will be ordered according to the field_desc argument in the constructor, along with index as the last field.
- return_type="pandas": An ArchiveDataFrame with the following columns (by default):
  - For fields that are scalars, a single column with the field name. For example, objective would have a single column called objective.
  - For fields that are 1D arrays, multiple columns with the name suffixed by its index. For instance, if we have a measures field of length 10, we create 10 columns with names measures_0, measures_1, …, measures_9. We do not currently support fields with >1D data.
  - 1 column of integers (np.int32) for the index, named index.
In short, the dataframe might look like this:
    objective    measures_0    …    index
    …
Like the other return types, the columns can be adjusted with the fields parameter.
Note
This return type will require copying all fields in the ArrayStore into NumPy arrays, if they are not already NumPy arrays.
All data returned by this method will be a copy, i.e., the data will not update as the store changes.
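The return structure described above can be modeled with a toy function over Python lists. This is a sketch under assumptions, not the library's implementation: `retrieve` here is a hypothetical free function, and it omits the single-str form of fields and the "pandas" return type. It reproduces the documented dict example, including the unoccupied index 1 whose value should be ignored.

```python
def retrieve(fields, occupied, indices, field_names=None, return_type="dict"):
    """Toy model of the (occupied, data) return value."""
    if field_names is None:
        # Default: all fields in their stored order, then "index" last.
        field_names = [*fields, "index"]
    columns = []
    for name in field_names:
        if name == "index":
            columns.append(list(indices))
        elif name in fields:
            # Building new lists means the result is a copy.
            columns.append([fields[name][i] for i in indices])
        else:
            raise ValueError(f"Invalid field name: {name!r}")
    occ = [occupied[i] for i in indices]
    if return_type == "dict":
        # Duplicate names collapse because dict keys are unique.
        return occ, dict(zip(field_names, columns))
    if return_type == "tuple":
        # Duplicate names are kept as duplicate entries.
        return occ, tuple(columns)
    raise ValueError(f"Invalid return_type: {return_type!r}")


fields = {"objective": [2.3, 6.0, 0.0, 0.0, 1.5]}
occupied = [True, False, False, False, True]

occ, data = retrieve(fields, occupied, [4, 1, 0])
print(occ, data)

occ2, (obj_a, obj_b) = retrieve(
    fields, occupied, [4, 0], ["objective", "objective"], return_type="tuple"
)
```

In the first call, the second element of `occ` is False, so the 6.0 at that position in the objective data carries no meaning.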
- Return type:¶
- Raises:¶
ValueError – Invalid field name provided.
ValueError – Invalid return_type provided.
ValueError – Passed return_type="pandas" when one of the fields has >1D data.
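The column naming rule for the "pandas" return type, along with the error for >1D fields, can be sketched as below. `dataframe_columns` is a hypothetical helper that derives column names from a field description; it does not build a dataframe and is not part of ribs.

```python
def dataframe_columns(field_desc):
    """Derive default column names from a field description."""
    columns = []
    for name, (shape, _dtype) in field_desc.items():
        shape = (shape,) if isinstance(shape, int) else tuple(shape)
        if len(shape) == 0:
            # Scalar field: a single column with the field name.
            columns.append(name)
        elif len(shape) == 1:
            # 1D field: one column per element, suffixed by position.
            columns.extend(f"{name}_{i}" for i in range(shape[0]))
        else:
            raise ValueError(f"Field {name!r} has >1D data")
    columns.append("index")  # the index column comes last by default
    return columns


cols = dataframe_columns({
    "objective": ((), "float32"),
    "measures": ((3,), "float32"),
})
print(cols)
```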
- property dtypes : dict[str, dtype | dtype]¶
Data types of fields in the store.
Example
    store.dtypes == {
        "objective": np.float32,
        "measures": np.float32,
    }
- property dtypes_with_index : dict[str, dtype | dtype]¶
Data types of fields in the store, plus the index.
Example
    store.dtypes_with_index == {
        "objective": np.float32,
        "measures": np.float32,
        "index": np.int32,
    }
- property field_desc : dict[str, tuple[int | integer | tuple[int | integer, ...], DTypeLike]]¶
Description of fields in the store.
Example
    store.field_desc == {
        "objective": ((), np.float32),
        "measures": ((10,), np.float32),
    }
See the constructor field_desc parameter for more info. Unlike in the field_desc in the constructor, which accepts ints for 1D field shapes (e.g., 5), this field_desc shows 1D field shapes as tuples of 1 entry (e.g., (5,)). Since dicts in Python are ordered, note that this dict will have the same order as in the constructor.
- property field_list : list[str]¶
List of fields in the store.
Example
store.field_list == ["objective", "measures"]
- property field_list_with_index : list[str]¶
List of fields in the store, plus the index.
The index is always added at the end of the list.
Example
store.field_list_with_index == ["objective", "measures", "index"]