fifteen.experiments._experiment
Simple experiment manager for metadata, logs, and checkpoints.
Source: https://github.com/brentyi/dfgo/blob/master/lib/experiment_files.py
Module Contents
Classes
- Experiment: We define an “experiment” as a simple directory, where files associated with some run of a training script are co-located.
Attributes
- fifteen.experiments._experiment.T
- fifteen.experiments._experiment.PytreeType
- fifteen.experiments._experiment.Pytree
- class fifteen.experiments._experiment.Experiment
We define an “experiment” as a simple directory, where files associated with some run of a training script are co-located.
There’s very little real code here; instead we use a common experiment data directory to implement thin wrappers around:
- flax.training.checkpoints for checkpointing.
- PyYAML and tyro for serializing metadata.
- tensorboardX.SummaryWriter for logging.
- data_dir: pathlib.Path
- verbose: bool = True
- write_metadata(self, name: str, object: Any) -> None
Serialize an object as a yaml file, then save it to the experiment’s metadata directory. Includes special handling for dataclasses (via tyro).
- read_metadata(self, name: str, expected_type: Type[T]) -> T
Load an object from the experiment’s metadata directory. Includes special handling for dataclasses (via tyro).
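As a rough illustration of the write/read round trip described above, the sketch below stores a dataclass in a metadata directory and rebuilds it from an expected type. It is a stand-in only: the standard-library json module is swapped in for PyYAML and tyro so that the example runs without extra dependencies, and the names `write_metadata_sketch`, `read_metadata_sketch`, `TrainConfig`, and the `<name>.json` file layout are assumptions for illustration, not part of fifteen.

```python
import dataclasses
import json
import pathlib
import tempfile
from typing import Any, Type, TypeVar

T = TypeVar("T")


@dataclasses.dataclass
class TrainConfig:
    # Hypothetical metadata object; not part of fifteen.
    learning_rate: float
    batch_size: int


def write_metadata_sketch(metadata_dir: pathlib.Path, name: str, obj: Any) -> pathlib.Path:
    """Serialize `obj` to `<metadata_dir>/<name>.json` (json stands in for yaml)."""
    metadata_dir.mkdir(parents=True, exist_ok=True)
    # Special handling for dataclass instances: convert to a plain dict first.
    if dataclasses.is_dataclass(obj) and not isinstance(obj, type):
        payload = dataclasses.asdict(obj)
    else:
        payload = obj
    path = metadata_dir / f"{name}.json"
    path.write_text(json.dumps(payload, indent=2))
    return path


def read_metadata_sketch(metadata_dir: pathlib.Path, name: str, expected_type: Type[T]) -> T:
    """Load `<metadata_dir>/<name>.json` and rebuild an `expected_type` instance."""
    payload = json.loads((metadata_dir / f"{name}.json").read_text())
    if dataclasses.is_dataclass(expected_type):
        return expected_type(**payload)
    # For plain types, only check that the loaded object matches what the caller expects.
    assert isinstance(payload, expected_type)
    return payload


with tempfile.TemporaryDirectory() as tmp:
    metadata_dir = pathlib.Path(tmp) / "experiment"
    write_metadata_sketch(metadata_dir, "train_config", TrainConfig(1e-3, 32))
    restored = read_metadata_sketch(metadata_dir, "train_config", TrainConfig)
```

Passing the expected type at read time is what lets the loader return a typed object rather than a raw dictionary.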
- save_checkpoint(self, target: Pytree, step: int, prefix: str = 'checkpoint_', keep: int = 1, overwrite: bool = False, keep_every_n_steps: int | None = None) -> str
Thin wrapper around flax’s save_checkpoint() function. Returns a file name, as a string.
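The `keep` argument is forwarded to flax, where it bounds how many past checkpoint files are retained. The sketch below models only that retention idea with plain integers; it is not flax's implementation, it does not model `keep_every_n_steps`, and the helper name `checkpoints_to_keep` and the `<prefix><step>` file naming are assumptions for illustration.

```python
from typing import List


def checkpoints_to_keep(steps: List[int], keep: int = 1) -> List[int]:
    """Return the steps whose checkpoints survive when only the newest `keep` are retained."""
    if keep < 1:
        raise ValueError("keep must be at least 1")
    return sorted(steps)[-keep:]


# Four checkpoints were written; with keep=2 only the two most recent remain.
surviving = checkpoints_to_keep([100, 200, 300, 400], keep=2)
file_names = [f"checkpoint_{step}" for step in surviving]
```

With the default `keep=1`, only the latest checkpoint would remain, which is worth remembering before relying on older checkpoints being available.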
- restore_checkpoint(self, target: PytreeType, step: int | None = None, prefix: str = 'checkpoint_') -> PytreeType
Thin wrapper around flax’s restore_checkpoint() function.
- log(self, log_data: fifteen.experiments.TensorboardLogData, step: int, log_scalars_every_n: int = 1, log_histograms_every_n: int = 1)
Logging helper for Tensorboard.
For TensorboardLogData instances returned from pmap-transformed functions, see TensorboardLogData.fix_sharded_scalars().
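The `log_scalars_every_n` and `log_histograms_every_n` parameters suggest that values are written only on some steps. A minimal sketch of that gating follows, assuming the check is `step % n == 0`, which the text above does not confirm; `scalars_to_write` is a hypothetical helper that returns records instead of calling a real summary writer.

```python
from typing import Dict, List, Tuple


def scalars_to_write(
    scalars: Dict[str, float], step: int, log_scalars_every_n: int = 1
) -> List[Tuple[str, float, int]]:
    """Return (tag, value, step) records for steps that fall on the logging interval."""
    if step % log_scalars_every_n != 0:
        return []  # Off-interval step: nothing is written.
    return [(tag, value, step) for tag, value in scalars.items()]


written: List[Tuple[str, float, int]] = []
for step in range(6):
    written.extend(
        scalars_to_write({"loss": 1.0 / (step + 1)}, step, log_scalars_every_n=2)
    )
logged_steps = [record[2] for record in written]
```

Using a larger interval for histograms than for scalars is a common way to keep log files small, since histograms are much heavier to store.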
- assert_new(self) -> Experiment
Makes sure that there are no existing checkpoints, logs, or metadata. Returns self.
- assert_exists(self) -> Experiment
Makes sure that there are existing checkpoints, logs, or metadata. Returns self.
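Because both checks return self, they can be chained directly onto construction. The sketch below shows that pattern with a hypothetical `ExperimentSketch` class; it assumes "existing" means the data directory is present and non-empty, which may differ from the actual checks in fifteen.

```python
import dataclasses
import pathlib
import tempfile


@dataclasses.dataclass(frozen=True)
class ExperimentSketch:
    # Hypothetical stand-in for Experiment; only the directory checks are modeled.
    data_dir: pathlib.Path

    def _has_files(self) -> bool:
        return self.data_dir.exists() and any(self.data_dir.iterdir())

    def assert_new(self) -> "ExperimentSketch":
        assert not self._has_files(), f"{self.data_dir} already contains files"
        return self

    def assert_exists(self) -> "ExperimentSketch":
        assert self._has_files(), f"{self.data_dir} is missing or empty"
        return self


with tempfile.TemporaryDirectory() as tmp:
    # Returning self lets construction and validation happen in one expression.
    experiment = ExperimentSketch(pathlib.Path(tmp) / "run_0").assert_new()

    # Simulate a training run writing a file into the experiment directory.
    experiment.data_dir.mkdir(parents=True)
    (experiment.data_dir / "metadata.yaml").write_text("seed: 0\n")

    checked = experiment.assert_exists()
    is_same_object = checked is experiment

    # Once files exist, asserting that the experiment is new should fail.
    try:
        experiment.assert_new()
        second_assert_new_passed = True
    except AssertionError:
        second_assert_new_passed = False
```

A training script would typically call the "new" check, while an evaluation script that restores checkpoints would call the "exists" check.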
- clear(self) -> Experiment
Deletes self.data_dir. This clears all checkpoints, logs, and metadata inside of it. Returns self.
- move(self, new_data_dir: pathlib.Path) -> Experiment
Move all files corresponding to an experiment to a new location. Returns the updated Experiment object.
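The difference in return values between clear() and move() can be sketched as follows: clearing hands back the same object, while moving hands back an object that points at the new directory. `ExperimentDirSketch` is a hypothetical stand-in built on shutil, and its use of `shutil.rmtree` and `shutil.move` is an assumption about how such behavior could be implemented, not a description of fifteen's code.

```python
import dataclasses
import pathlib
import shutil
import tempfile


@dataclasses.dataclass(frozen=True)
class ExperimentDirSketch:
    # Hypothetical stand-in for Experiment; only directory handling is modeled.
    data_dir: pathlib.Path

    def clear(self) -> "ExperimentDirSketch":
        # Delete the directory and everything inside it, if present.
        if self.data_dir.exists():
            shutil.rmtree(self.data_dir)
        return self

    def move(self, new_data_dir: pathlib.Path) -> "ExperimentDirSketch":
        # Relocate the files, then return an object that points at the new location.
        shutil.move(str(self.data_dir), str(new_data_dir))
        return dataclasses.replace(self, data_dir=new_data_dir)


with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp)
    old = ExperimentDirSketch(root / "run_old")
    old.data_dir.mkdir()
    (old.data_dir / "notes.txt").write_text("hello")

    new = old.move(root / "run_new")
    moved_ok = (new.data_dir / "notes.txt").read_text() == "hello"
    old_gone = not old.data_dir.exists()

    same_after_clear = new.clear() is new
    cleared = not new.data_dir.exists()
```

After a move, the original object still refers to the old path, so callers should continue with the returned object.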