fifteen.experiments._experiment

Simple experiment manager for managing metadata, logs, and checkpoints.

Source: https://github.com/brentyi/dfgo/blob/master/lib/experiment_files.py

Module Contents

Classes

Experiment

We define an "experiment" as a simple directory, where files associated with some run of a training script are co-located.

Attributes

T

PytreeType

Pytree

cached_property

fifteen.experiments._experiment.T
fifteen.experiments._experiment.PytreeType
fifteen.experiments._experiment.Pytree
fifteen.experiments._experiment.cached_property
class fifteen.experiments._experiment.Experiment

We define an “experiment” as a simple directory, where files associated with some run of a training script are co-located.

There’s very little real code here; instead we use a common experiment data directory to implement thin wrappers around:

  • flax.training.checkpoints for checkpointing.

  • PyYAML and tyro for serializing metadata.

  • tensorboardX.SummaryWriter for logging.

data_dir: pathlib.Path
verbose: bool = True
write_metadata(self, name: str, object: Any) -> None

Serialize an object as a YAML file, then save it to the experiment’s metadata directory. Includes special handling for dataclasses (via tyro).

read_metadata(self, name: str, expected_type: Type[T]) -> T

Load an object from the experiment’s metadata directory. Includes special handling for dataclasses (via tyro).
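The write/read round trip described above can be sketched with the standard library alone. This is a stand-in, not the library's implementation: it swaps JSON in for YAML and plain `dataclasses` helpers in for tyro, and the free functions, the `metadata/` subdirectory, and the `.json` file naming are all assumptions made for illustration.

```python
import dataclasses
import json
import pathlib
from typing import Any, Type, TypeVar

T = TypeVar("T")


def write_metadata(data_dir: pathlib.Path, name: str, obj: Any) -> pathlib.Path:
    """Serialize `obj` to `<data_dir>/metadata/<name>.json` (hypothetical layout)."""
    metadata_dir = data_dir / "metadata"
    metadata_dir.mkdir(parents=True, exist_ok=True)
    # Special handling for dataclass instances: convert them to plain dicts first.
    if dataclasses.is_dataclass(obj) and not isinstance(obj, type):
        obj = dataclasses.asdict(obj)
    path = metadata_dir / f"{name}.json"
    path.write_text(json.dumps(obj, indent=2))
    return path


def read_metadata(data_dir: pathlib.Path, name: str, expected_type: Type[T]) -> T:
    """Load `<name>.json` and rebuild an instance of `expected_type`."""
    raw = json.loads((data_dir / "metadata" / f"{name}.json").read_text())
    if dataclasses.is_dataclass(expected_type):
        # Only flat dataclasses (no nested dataclass fields) are handled here.
        return expected_type(**raw)
    if not isinstance(raw, expected_type):
        raise TypeError(f"expected {expected_type}, got {type(raw)}")
    return raw
```

Passing `expected_type` at read time is what lets the loader return a typed object rather than a bare dictionary, which matches the `Type[T] -> T` shape of the signature above.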

save_checkpoint(self, target: Pytree, step: int, prefix: str = 'checkpoint_', keep: int = 1, overwrite: bool = False, keep_every_n_steps: int | None = None) -> str

Thin wrapper around flax’s save_checkpoint() function. Returns a file name, as a string.

restore_checkpoint(self, target: PytreeType, step: int | None = None, prefix: str = 'checkpoint_') -> PytreeType

Thin wrapper around flax’s restore_checkpoint() function.

log(self, log_data: fifteen.experiments.TensorboardLogData, step: int, log_scalars_every_n: int = 1, log_histograms_every_n: int = 1)

Logging helper for Tensorboard.

For TensorboardLogData instances returned from pmap-transformed functions, see TensorboardLogData.fix_sharded_scalars().
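The two `every_n` arguments suggest that scalars and histograms can be written at different cadences. The sketch below assumes the gate is a simple `step % n == 0` check, which the text above does not confirm. It records what would be written into a plain list instead of calling a real summary writer, and the `log` function and `Record` alias are hypothetical.

```python
from typing import Iterable, List, Tuple

Record = Tuple[str, str, int]  # (kind, tag, step)


def log(
    records: List[Record],
    scalar_tags: Iterable[str],
    histogram_tags: Iterable[str],
    step: int,
    log_scalars_every_n: int = 1,
    log_histograms_every_n: int = 1,
) -> None:
    """Append one record per tag, but only on steps that match each cadence."""
    # Assumed gating rule: write when the step is a multiple of the interval.
    if step % log_scalars_every_n == 0:
        records.extend(("scalar", tag, step) for tag in scalar_tags)
    if step % log_histograms_every_n == 0:
        records.extend(("histogram", tag, step) for tag in histogram_tags)
```

Histograms are usually more expensive to write than scalars, so a larger `log_histograms_every_n` is a common way to keep log files small.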

assert_new(self) -> Experiment

Makes sure that there are no existing checkpoints, logs, or metadata. Returns self.

assert_exists(self) -> Experiment

Makes sure that there are existing checkpoints, logs, or metadata. Returns self.

clear(self) -> Experiment

Deletes self.data_dir, which clears all checkpoints, logs, and metadata inside it. Returns self.

move(self, new_data_dir: pathlib.Path) -> Experiment

Moves all files belonging to an experiment to a new location. Returns the updated Experiment object.
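Because `assert_new()`, `assert_exists()`, and `clear()` return self and `move()` returns an updated object, the lifecycle methods can be chained. The class below is a minimal stand-in built on `pathlib` and `shutil`, not the library's code: the name `DirExperiment`, the "directory is non-empty" test for existence, and the frozen dataclass are all assumptions for illustration.

```python
import dataclasses
import pathlib
import shutil


@dataclasses.dataclass(frozen=True)
class DirExperiment:
    """Hypothetical stand-in: an experiment is just a directory on disk."""

    data_dir: pathlib.Path

    def _has_files(self) -> bool:
        # Assumed rule: an experiment "exists" when its directory is non-empty.
        return self.data_dir.exists() and any(self.data_dir.iterdir())

    def assert_new(self) -> "DirExperiment":
        assert not self._has_files(), f"{self.data_dir} already has contents"
        return self

    def assert_exists(self) -> "DirExperiment":
        assert self._has_files(), f"{self.data_dir} is missing or empty"
        return self

    def clear(self) -> "DirExperiment":
        # Delete the whole directory tree, if there is one.
        if self.data_dir.exists():
            shutil.rmtree(self.data_dir)
        return self

    def move(self, new_data_dir: pathlib.Path) -> "DirExperiment":
        assert not new_data_dir.exists(), f"{new_data_dir} already exists"
        new_data_dir.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(self.data_dir), str(new_data_dir))
        # The instance is frozen, so return a copy that points at the new path.
        return dataclasses.replace(self, data_dir=new_data_dir)
```

A training script would typically start with something like `DirExperiment(path).clear().assert_new()`, and an evaluation script with `DirExperiment(path).assert_exists()`, so that a mistyped path fails early.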