ribs.discount_models.DiscountModelManager

class ribs.discount_models.DiscountModelManager(model: nn.Module, optimizer: torch.optim.Optimizer, device: torch.device, *, train_epochs: Int, train_cutoff_loss: Float, train_batch_size: Int, normalize_measures: 'zero_one' | 'negative_one_one' | None = None, measures_low: ArrayLike | None = None, measures_high: ArrayLike | None = None, normalize_discount: 'zero_one' | 'negative_one_one' | None = None, discount_low: Float | None = None, discount_high: Float | None = None)

Wraps a PyTorch model so it can be used as a discount model.

This class handles operations like training the model to match new discount value targets (in training_loop()) and performing inference (in inference()).

Note

This class assumes all input and output data is of type float32, which is the default type in PyTorch. If different data types are needed, one solution may be to cast the data before/after calls to this class (as is done in DiscountArchive).

Note

This class requires PyTorch to be installed, e.g., by running pip install torch.

Parameters:
model: nn.Module

A PyTorch model that can take in batches of measures and output batches of scalar discount values. We assume this model has already been placed on the desired device.

optimizer: torch.optim.Optimizer

A PyTorch optimizer that is set up to optimize the model’s parameters. We use this to train the discount model to output new discount value targets. The optimizer state is maintained across calls to training_loop().

device: torch.device

A PyTorch device for placing tensors during training.

train_epochs: Int

When training_loop() is called, the model trains until either (1) the total loss for an epoch falls below train_cutoff_loss, described below, or (2) the number of epochs reaches train_epochs.

train_cutoff_loss: Float

The loss threshold below which training_loop() stops early. See train_epochs.

train_batch_size: Int

During each epoch of training_loop(), the dataset of measures and targets will be used to train the model with this batch size.

normalize_measures: 'zero_one' | 'negative_one_one' | None = None

Whether to normalize the measures. Pass None (default) to indicate no normalization. Alternatively, pass “zero_one” to normalize to [0, 1] or “negative_one_one” to normalize to [-1, 1] (along each dimension). Normalization is a linear transform from the range defined by measures_low and measures_high, described below, to the chosen range.

measures_low: ArrayLike | None = None

If normalize_measures is set, this is the lower bound of the measures for normalizing.

measures_high: ArrayLike | None = None

If normalize_measures is set, this is the upper bound of the measures for normalizing.

normalize_discount: 'zero_one' | 'negative_one_one' | None = None

Whether to normalize the discount values. Pass None (default) to indicate no normalization. During training, the targets are linearly transformed to a target range such as [0, 1], and during inference, the discount values output by the model are un-normalized before being returned. Pass “zero_one” to set the range to [0, 1], and “negative_one_one” to set the range to [-1, 1].

discount_low: Float | None = None

If normalize_discount is set, this is the lower bound of the discount values for normalizing.

discount_high: Float | None = None

If normalize_discount is set, this is the upper bound of the discount values for normalizing.
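The linear transforms described for normalize_measures and normalize_discount can be illustrated with a minimal pure-Python sketch. The helpers normalize and unnormalize are hypothetical names used only for illustration and are not part of ribs; the class itself operates on tensors and may differ in detail.

```python
def normalize(value, low, high, mode):
    """Linearly maps value from [low, high] to the range named by mode."""
    if mode is None:
        return value
    unit = (value - low) / (high - low)  # now in [0, 1]
    if mode == "zero_one":
        return unit
    if mode == "negative_one_one":
        return 2.0 * unit - 1.0
    raise ValueError(f"Unknown mode: {mode}")


def unnormalize(value, low, high, mode):
    """Inverse of normalize: maps a normalized value back to [low, high]."""
    if mode is None:
        return value
    if mode == "zero_one":
        unit = value
    elif mode == "negative_one_one":
        unit = (value + 1.0) / 2.0
    else:
        raise ValueError(f"Unknown mode: {mode}")
    return low + unit * (high - low)


# Measures are normalized along each dimension with that dimension's bounds.
measures_low = [0.0, -10.0]
measures_high = [4.0, 10.0]
measure = [1.0, 5.0]
normalized_measure = [
    normalize(m, lo, hi, "negative_one_one")
    for m, lo, hi in zip(measure, measures_low, measures_high)
]  # [-0.5, 0.5]

# A model output of 0.25 in "zero_one" space, un-normalized to [0, 100].
discount = unnormalize(0.25, 0.0, 100.0, "zero_one")  # 25.0
```

In this sketch, training targets would go through normalize before the loss is computed, and model outputs would go through unnormalize before being returned from inference.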

Methods

inference(measures[, batch_size])

Computes discount values at the given measures using the model.

training_loop(measures, targets)

Regresses the discount model to match the given targets at the given measures.

inference(measures: numpy.typing.ArrayLike, batch_size: int | None = None) → ndarray

Computes discount values at the given measures using the model.

This method also temporarily puts the model in eval mode and uses torch.no_grad.

Parameters:
measures: numpy.typing.ArrayLike

Inputs to the model of size (n_measures, measure_dim).

batch_size: int | None = None

If passed in, the model will only be passed batch_size inputs at a time. This can be useful if, for instance, the model is very large and there is insufficient memory to handle many inputs simultaneously.

Returns:

The discount values at the input measures.
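The effect of batch_size can be sketched as follows. This is an illustration in plain Python, not the library's implementation: batched_inference and toy_model are hypothetical names, and the real method works with a PyTorch model in eval mode under torch.no_grad.

```python
def batched_inference(model_fn, measures, batch_size=None):
    """Applies model_fn to measures in chunks of at most batch_size rows."""
    if batch_size is None:
        # No limit given: pass every input through the model at once.
        return list(model_fn(measures))
    outputs = []
    for start in range(0, len(measures), batch_size):
        chunk = measures[start:start + batch_size]
        outputs.extend(model_fn(chunk))
    return outputs


# Hypothetical stand-in for a model: records chunk sizes, returns row sums.
seen_chunk_sizes = []


def toy_model(batch):
    seen_chunk_sizes.append(len(batch))
    return [sum(row) for row in batch]


measures = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0], [6.0, 7.0], [8.0, 9.0]]
discounts = batched_inference(toy_model, measures, batch_size=2)
# The model saw chunks of 2, 2, and 1 rows, yet the output covers all 5 rows.
```

The result has one value per input row regardless of batch_size; only the peak number of inputs handled at once changes.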

training_loop(measures: numpy.typing.ArrayLike, targets: numpy.typing.ArrayLike) → list[float]

Regresses the discount model to match the given targets at the given measures.

Training proceeds until either (1) the total loss for an epoch falls below train_cutoff_loss, or (2) the number of epochs reaches train_epochs. The loss function used during training is MSELoss.
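The two stopping conditions can be sketched in plain Python. The names run_training and halve_loss are hypothetical, and the sketch assumes the loss is checked once at the end of each epoch; the actual implementation may order these checks differently.

```python
def run_training(train_one_epoch, train_epochs, train_cutoff_loss):
    """Runs epochs until the loss drops below the cutoff or the budget runs out."""
    epoch_losses = []
    for _ in range(train_epochs):
        loss = train_one_epoch()
        epoch_losses.append(loss)
        if loss < train_cutoff_loss:
            break  # condition (1): the epoch loss fell below the cutoff
    # Falling out of the loop is condition (2): train_epochs was reached.
    return epoch_losses


# Hypothetical epoch function whose loss halves each time it is called.
state = {"loss": 1.0}


def halve_loss():
    state["loss"] /= 2.0
    return state["loss"]


losses = run_training(halve_loss, train_epochs=10, train_cutoff_loss=0.1)
# losses == [0.5, 0.25, 0.125, 0.0625]: training stopped after 4 of 10 epochs.
```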

Parameters:
measures: numpy.typing.ArrayLike

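(batch_size, measure_dim) array of measure values. Here, batch_size is the number of data points in the dataset, which is distinct from train_batch_size.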

targets: numpy.typing.ArrayLike

(batch_size,) array of target values for the discount function.

Returns:

A list with the total MSE loss accumulated on each epoch, divided by the size of the dataset. Strictly speaking, the model is updated after every batch is passed through it, so this is not the loss that one would obtain if the measures were all passed through the model at once.
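To see why the accumulated loss differs from a single full-dataset evaluation, consider this toy sketch in plain Python. It fits a one-parameter model y = w * x with mini-batch gradient descent; all names here are hypothetical, and the plain gradient step stands in for whatever optimizer is passed to the class.

```python
def epoch_with_running_loss(w, xs, ys, batch_size, lr):
    """One epoch of mini-batch gradient descent on y = w * x.

    Returns the updated weight and the running loss: squared errors summed
    batch by batch (each batch scored before its own update), then divided
    by the dataset size.
    """
    total_squared_error = 0.0
    for start in range(0, len(xs), batch_size):
        bx = xs[start:start + batch_size]
        by = ys[start:start + batch_size]
        errors = [w * x - y for x, y in zip(bx, by)]
        total_squared_error += sum(e * e for e in errors)
        # Gradient of the batch's mean squared error with respect to w.
        grad = sum(2.0 * e * x for e, x in zip(errors, bx)) / len(bx)
        w -= lr * grad  # the weight changes before the next batch is scored
    return w, total_squared_error / len(xs)


def full_dataset_mse(w, xs, ys):
    """MSE when all points pass through the model at once with a fixed weight."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]  # generated by y = 2 * x
w_start = 0.0
w_end, running_loss = epoch_with_running_loss(
    w_start, xs, ys, batch_size=2, lr=0.05
)
loss_before = full_dataset_mse(w_start, xs, ys)
loss_after = full_dataset_mse(w_end, xs, ys)
```

In this toy run, running_loss falls between loss_before and loss_after, because the second batch is scored with a weight that the first batch has already improved.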