ribs.discount_models.DiscountModelManager
- class ribs.discount_models.DiscountModelManager(model: nn.Module, optimizer: torch.optim.Optimizer, device: torch.device, *, train_epochs: Int, train_cutoff_loss: Float, train_batch_size: Int, normalize_measures: 'zero_one' | 'negative_one_one' | None = None, measures_low: ArrayLike | None = None, measures_high: ArrayLike | None = None, normalize_discount: 'zero_one' | 'negative_one_one' | None = None, discount_low: Float | None = None, discount_high: Float | None = None)
Wraps a PyTorch model so it can be used as a discount model.
This class handles operations like training the model to match new discount value targets (in training_loop()) and performing inference (in inference()).
Note
This class assumes all input and output data is of type float32, which is the default type in PyTorch. If different data types are needed, one solution is to cast the data before and after calls to this class (as is done in DiscountArchive).
Note
This class requires PyTorch to be installed, e.g., by running pip install torch.
- Parameters:
- model: nn.Module
A PyTorch model that can take in batches of measures and output batches of scalar discount values. We assume this model has already been placed on the desired device.
- optimizer: torch.optim.Optimizer
A PyTorch optimizer that is set up to optimize the model’s parameters. We use this to train the discount model to output new discount value targets. The optimizer state is maintained across calls to training_loop().
- device: torch.device
A PyTorch device for placing tensors during training.
- train_epochs: Int
When training_loop() is called, the model will train until either (1) the total loss on each epoch is less than the train_cutoff_loss described below, or (2) the number of epochs reaches train_epochs.
- train_cutoff_loss: Float
See train_epochs.
- train_batch_size: Int
During each epoch of training_loop(), the dataset of measures and targets will be used to train the model with this batch size.
- normalize_measures: 'zero_one' | 'negative_one_one' | None = None
Whether to normalize the measures. Pass None (default) to indicate no normalization. Alternatively, pass “zero_one” to normalize to [0, 1] or “negative_one_one” to normalize to [-1, 1] (along each dimension). To normalize to these values, we linearly transform from the range defined by measures_low and measures_high, described below.
- measures_low: ArrayLike | None = None
If normalize_measures is set, this is the lower bound of the measures for normalizing.
- measures_high: ArrayLike | None = None
If normalize_measures is set, this is the upper bound of the measures for normalizing.
- normalize_discount: 'zero_one' | 'negative_one_one' | None = None
Whether to normalize the discount values. Pass None (default) to indicate no normalization. During training, the targets are linearly transformed to a target range such as [0, 1], and during inference, the discount values output by the model are un-normalized before being returned. Pass “zero_one” to set the range to [0, 1], and “negative_one_one” to set the range to [-1, 1].
- discount_low: Float | None = None
If normalize_discount is set, this is the lower bound of the discount values for normalizing.
- discount_high: Float | None = None
If normalize_discount is set, this is the upper bound of the discount values for normalizing.
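The linear normalization described for normalize_measures and normalize_discount can be sketched in plain Python as follows. This is only an illustration of the arithmetic, not the class's actual implementation; the helper names normalize and unnormalize are hypothetical, and the sketch assumes low < high in every dimension.

```python
def normalize(values, low, high, mode):
    """Linearly maps each value from [low, high] to the range named by mode."""
    if mode is None:
        return list(values)
    # First map each dimension to [0, 1].
    unit = [(v - lo) / (hi - lo) for v, lo, hi in zip(values, low, high)]
    if mode == "zero_one":
        return unit
    if mode == "negative_one_one":
        # Stretch [0, 1] to [-1, 1].
        return [2.0 * u - 1.0 for u in unit]
    raise ValueError(f"Unknown mode: {mode}")


def unnormalize(values, low, high, mode):
    """Inverse of normalize(): maps values back to [low, high]."""
    if mode is None:
        return list(values)
    if mode == "zero_one":
        unit = list(values)
    elif mode == "negative_one_one":
        unit = [(v + 1.0) / 2.0 for v in values]
    else:
        raise ValueError(f"Unknown mode: {mode}")
    return [u * (hi - lo) + lo for u, lo, hi in zip(unit, low, high)]


# One 2D measure whose dimensions are bounded by [0, 10] and [-5, 5].
measure = [2.5, 0.0]
low, high = [0.0, -5.0], [10.0, 5.0]
scaled = normalize(measure, low, high, "negative_one_one")
restored = unnormalize(scaled, low, high, "negative_one_one")
```

In this sketch, the forward map corresponds to what happens to measures (and to targets during training), while the inverse map corresponds to un-normalizing the model's discount outputs during inference.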
Methods
inference(measures[, batch_size]): Computes discount values at the given measures using the model.
training_loop(measures, targets): Regresses the discount model to match the given targets at the given measures.
- inference(measures: numpy.typing.ArrayLike, batch_size: int | None = None) -> ndarray
Computes discount values at the given measures using the model.
This method also temporarily puts the model in eval mode and uses torch.no_grad.
- Parameters:
- measures: numpy.typing.ArrayLike
Inputs to the model of size (n_measures, measure_dim).
- batch_size: int | None = None
If passed in, the model will only be passed batch_size inputs at a time. This can be useful if, for instance, the model is very large and there is insufficient memory to handle many inputs simultaneously.
- Returns:
The discount values at the input measures.
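The effect of batch_size can be illustrated with a small sketch that feeds inputs to a model in chunks and concatenates the outputs. The function batched_inference and the stand-in toy_model are hypothetical names used only for illustration; the real method operates on a PyTorch model in eval mode under torch.no_grad, which this plain-Python sketch does not reproduce.

```python
def batched_inference(model_fn, measures, batch_size=None):
    """Applies model_fn to measures in chunks of at most batch_size rows."""
    if batch_size is None:
        # No batching requested: pass everything through at once.
        return list(model_fn(measures))
    outputs = []
    for start in range(0, len(measures), batch_size):
        outputs.extend(model_fn(measures[start:start + batch_size]))
    return outputs


batch_sizes_seen = []


def toy_model(batch):
    # Hypothetical stand-in for a network: one scalar per row of measures.
    batch_sizes_seen.append(len(batch))
    return [sum(row) for row in batch]


# Five 2D measures, i.e., shape (n_measures, measure_dim) = (5, 2).
measures = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0], [6.0, 7.0], [8.0, 9.0]]
discounts = batched_inference(toy_model, measures, batch_size=2)
```

Here the stand-in model never sees more than two rows per call, yet the result still contains one value per input measure, in the original order.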
- training_loop(measures: numpy.typing.ArrayLike, targets: numpy.typing.ArrayLike) -> list[float]
Regresses the discount model to match the given targets at the given measures.
Training proceeds until either (1) the total loss on each epoch is less than the train_cutoff_loss, or (2) the number of epochs reaches train_epochs. The loss function used during training is MSELoss.
- Parameters:
- Returns:
A list with the total MSE loss accumulated on each epoch, divided by the size of the dataset. Strictly speaking, the model is updated after every batch is passed through it, so this is not the loss that one would obtain if the measures were all passed through the model at once.
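The two stopping conditions and the per-epoch loss bookkeeping can be sketched with a toy one-parameter model in plain Python. This is a minimal sketch under stated assumptions, not the method's actual implementation: the function training_loop_sketch, the model y = weight * x, the lr parameter, and the hand-written gradient step are all hypothetical, whereas the real method uses the supplied PyTorch optimizer and MSELoss.

```python
def training_loop_sketch(weight, inputs, targets, *, train_epochs,
                         train_cutoff_loss, train_batch_size, lr=0.05):
    """Fits y = weight * x with mini-batch gradient descent on squared error.

    Returns the final weight and a list of per-epoch losses, where each
    loss is the squared error summed over all batches in that epoch and
    divided by the dataset size.
    """
    n = len(inputs)
    epoch_losses = []
    for _ in range(train_epochs):  # Condition (2): epoch limit.
        total = 0.0
        for start in range(0, n, train_batch_size):
            xs = inputs[start:start + train_batch_size]
            ys = targets[start:start + train_batch_size]
            errors = [weight * x - y for x, y in zip(xs, ys)]
            total += sum(e * e for e in errors)
            # Gradient of the batch's mean squared error w.r.t. weight.
            grad = sum(2.0 * e * x for e, x in zip(errors, xs)) / len(xs)
            # The model changes after every batch, so `total` mixes losses
            # computed with different weights.
            weight -= lr * grad
        epoch_losses.append(total / n)
        if epoch_losses[-1] < train_cutoff_loss:  # Condition (1): cutoff.
            break
    return weight, epoch_losses


inputs = [1.0, 2.0, 3.0, 4.0]
targets = [2.0 * x for x in inputs]  # The ideal weight is 2.
final_weight, losses = training_loop_sketch(
    0.0, inputs, targets,
    train_epochs=50, train_cutoff_loss=1e-4, train_batch_size=2)
```

In this run the loop exits through the cutoff condition well before the epoch limit; lowering train_epochs enough makes the epoch limit the condition that ends training instead.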