Experiment Management
Overview
- class fannypack.utils.Buddy(experiment_name: str, model: Optional[torch.nn.modules.module.Module] = None, *, checkpoint_dir: str = 'checkpoints', checkpoint_max_to_keep: int = 5, metadata_dir: str = 'metadata', log_dir: str = 'logs', optimizer_type: str = 'adam', optimizer_checkpoint_interval: float = 300, optimizer_names: Optional[List[str]] = None, verbose: bool = True, cpu_only: bool = False)
Bases: fannypack.utils._buddy_include._checkpointing._BuddyCheckpointing, fannypack.utils._buddy_include._optimizer._BuddyOptimizer, fannypack.utils._buddy_include._logging._BuddyLogging, fannypack.utils._buddy_include._metadata._BuddyMetadata
Buddy is a model manager that abstracts away PyTorch boilerplate. Buddy helps with…
Creating/using/managing optimizers,
Checkpointing (models + optimizers),
Namespaced/scoped Tensorboard logging,
Saving human-readable metadata files.
- Parameters
experiment_name (str) – Name for the current model/experiment. Should not contain hyphens.
model (torch.nn.Module, optional) – PyTorch model to work with. If omitted, attach one later with attach_model().
- Keyword Arguments
checkpoint_dir (str, optional) – Path to save checkpoints into. Defaults to "checkpoints".
checkpoint_max_to_keep (int, optional) – Number of auto-saved checkpoints to keep. Set to None to keep all. Defaults to 5.
metadata_dir (str, optional) – Path to save metadata YAML files into. Defaults to "metadata".
log_dir (str, optional) – Path to save Tensorboard log files into. Defaults to "logs".
optimizer_type (str, optional) – Optimizer type to use: "adam" or "adadelta". Defaults to "adam".
optimizer_checkpoint_interval (float, optional) – How often to auto-checkpoint, as an interval in seconds. Time is computed from the first call to minimize(). Set to 0 to disable. Defaults to 300.
verbose (bool, optional) – Flag for toggling debug messages. Defaults to True.
cpu_only (bool, optional) – Set to True to turn off auto-detection of cuda support.
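The time-based auto-checkpointing described for optimizer_checkpoint_interval can be pictured with the following minimal sketch. It is not Buddy's implementation: the IntervalTrigger class, its injectable clock, and the choice to restart the timer after each trigger are all assumptions made for illustration. Only the "first call starts the clock" and "0 disables" behaviours come from the description above.

```python
import time
from typing import Callable, Optional


class IntervalTrigger:
    """Hypothetical helper: fires once every `interval` seconds."""

    def __init__(
        self, interval: float, clock: Callable[[], float] = time.monotonic
    ) -> None:
        self._interval = interval
        self._clock = clock
        self._last: Optional[float] = None

    def should_fire(self) -> bool:
        if self._interval == 0:
            return False  # an interval of 0 disables the trigger
        now = self._clock()
        if self._last is None:
            self._last = now  # the first call only starts the timer
            return False
        if now - self._last >= self._interval:
            self._last = now  # assumption: the timer restarts after firing
            return True
        return False


# Drive the trigger with a fake clock so the example runs instantly.
fake_now = [0.0]
trigger = IntervalTrigger(300.0, clock=lambda: fake_now[0])
results = []
for t in (0.0, 100.0, 300.0, 450.0, 600.0):
    fake_now[0] = t
    results.append(trigger.should_fire())
print(results)  # [False, False, True, False, True]
```

In a training loop, a check like should_fire() would sit next to each optimization step and decide whether a checkpoint is due.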
- attach_model(model: torch.nn.modules.module.Module) None
Attach a model to our Buddy, and move it onto buddy.device.
If a model isn’t explicitly passed into the constructor’s model field, attach_model should be called before any optimization, checkpointing, etc. happens.
- Parameters
model (nn.Module) – Model to attach.
- property device: torch.device
Read-only interface for the active torch device. Auto-detected in the constructor based on CUDA support.
- property model: torch.nn.modules.module.Module
Read-only interface for the attached model. Raises an error if no model is attached.
Checkpointing
- class fannypack.utils._buddy_include._checkpointing._BuddyCheckpointing(checkpoint_dir: str, checkpoint_max_to_keep: int)
Bases:
abc.ABC
Buddy’s model checkpointing interface.
- property checkpoint_labels: List[str]
Accessor for listing available checkpoint labels. These should be saved as experiment_name-label.ckpt in the checkpoint_dir directory.
- Returns
List[str] – Checkpoint labels, sorted alphabetically.
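To make the experiment_name-label.ckpt convention concrete, here is a small sketch of how labels could be recovered from a directory listing. It is not fannypack's code: the helper name and the file names are hypothetical, and matching on the "experiment_name-" prefix is an assumption. Under that assumption, a hyphen inside an experiment name could make one experiment's files look like another's, which is consistent with the advice that experiment names should not contain hyphens.

```python
from typing import List


def labels_for_experiment(filenames: List[str], experiment_name: str) -> List[str]:
    """Return sorted labels from names shaped like '<experiment_name>-<label>.ckpt'."""
    prefix = experiment_name + "-"
    suffix = ".ckpt"
    labels = []
    for filename in filenames:
        if filename.startswith(prefix) and filename.endswith(suffix):
            # Strip the experiment prefix and the file extension.
            labels.append(filename[len(prefix):-len(suffix)])
    return sorted(labels)


# Hypothetical contents of a checkpoint directory.
files = [
    "mnist_cnn-phase1.ckpt",
    "mnist_cnn-best.ckpt",
    "other_model-best.ckpt",
    "notes.txt",
]
print(labels_for_experiment(files, "mnist_cnn"))  # ['best', 'phase1']
```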
- load_checkpoint(label: Optional[str] = None, path: Optional[str] = None, experiment_name: Optional[str] = None) None
Loads a checkpoint. By default, loads the one with the highest number of training iterations.
Can also be specified via a label or file path.
- load_checkpoint_module(source: str, target: Optional[str] = None, label: Optional[str] = None, path: Optional[str] = None, experiment_name: Optional[str] = None) None
Loads parameters from a specific child module within a checkpoint. By default, loads the checkpoint with the highest number of training iterations.
Can also be specified via a label or file path.
- load_checkpoint_optimizer(source: str, target: Optional[str] = None, label: Optional[str] = None, path: Optional[str] = None, experiment_name: Optional[str] = None) None
Loads state associated with a specific optimizer from a checkpoint. By default, loads the checkpoint with the highest number of training iterations.
Can also be specified via a label or file path.
- load_checkpoint_optimizers(label=None, path=None, experiment_name=None) None
Loads all optimizer settings from a checkpoint. By default, loads the checkpoint with the highest number of training iterations.
Can also be specified via a label or file path.
- save_checkpoint(label: Optional[str] = None) None
Saves a checkpoint, which can optionally be labeled.
Optimization
- class fannypack.utils._buddy_include._optimizer._BuddyOptimizer(optimizer_type: str, optimizer_checkpoint_interval: float)
Bases:
abc.ABC
Buddy’s optimization interface.
- get_learning_rate(optimizer_name: str = 'primary') float
Gets an optimizer learning rate.
- minimize(loss: torch.Tensor, optimizer_name: str = 'primary', *, retain_graph: bool = False, checkpoint_interval: Optional[float] = None, clip_grad_max_norm: Optional[float] = None) None
Compute gradients and use them to minimize a loss function.
- property optimizer_steps: int
Read-only interface for the number of steps taken by the optimizer.
- set_default_learning_rate(value: Union[float, Callable[[int], float]]) None
Sets a default learning rate for new optimizers.
- set_learning_rate(value: Union[float, Callable[[int], float]], optimizer_name: str = 'primary') None
Sets an optimizer learning rate. Accepts either a floating point learning rate or a schedule function (int steps -> float LR).
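As an illustration of the schedule form, the following sketch defines a step-decay schedule as a plain function from optimizer step count to learning rate. The function name, base rate, and decay constants are hypothetical. Only the (int steps -> float LR) shape comes from the description above.

```python
def step_decay_schedule(steps: int) -> float:
    """Hypothetical schedule: start at 1e-3 and halve every 10,000 steps."""
    base_lr = 1e-3
    decay_every = 10_000
    return base_lr * (0.5 ** (steps // decay_every))


# Print the learning rate at a few step counts.
for steps in (0, 9_999, 10_000, 25_000):
    print(steps, step_decay_schedule(steps))
```

Given an existing Buddy instance, a function like this could be passed in place of a float, for example buddy.set_learning_rate(step_decay_schedule).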
Tensorboard Logging
- class fannypack.utils._buddy_include._logging._BuddyLogging(log_dir: str)
Bases:
abc.ABC
Buddy’s TensorBoard logging interface.
- log_grad_histogram(scope: str = 'grad') None
Log model gradients into a histogram. Should be called after buddy.minimize().
Naming: with scope set to “grad”, a parameter name “model.Linear.bias” will be logged to the tag buddy.log_scope_prefix("grad/model/Linear/bias").
- Parameters
scope (str, optional) – Scope to log gradients into. Defaults to “grad”.
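The naming rule can be sketched as a small string transformation: dots in the parameter name become slashes, and the scope is prepended. The helper below is hypothetical and only reproduces the example tags given in this section. Any enclosing log scope would still be applied on top by log_scope_prefix().

```python
def histogram_tag(scope: str, parameter_name: str) -> str:
    """Map a dotted parameter name to a slash-separated tag under `scope`."""
    return scope + "/" + parameter_name.replace(".", "/")


print(histogram_tag("grad", "model.Linear.bias"))     # grad/model/Linear/bias
print(histogram_tag("weights", "model.Linear.bias"))  # weights/model/Linear/bias
```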
- log_image(name: str, image: Union[torch.Tensor, numpy.ndarray], dataformats: str = 'CHW') None
Convenience function for logging an image tensor for visualization in TensorBoard.
Equivalent to:
buddy.log_writer.add_image(
    buddy.log_scope_prefix(name),
    image,
    buddy.optimizer_steps,
    dataformats=dataformats,
)
- Parameters
name (str) – Identifier for Tensorboard.
image (torch.Tensor or np.ndarray) – Image to log.
dataformats (str, optional) – Dimension ordering. Defaults to “CHW”.
- log_parameters_histogram(scope: str = 'weights', *, ignore_zero_grad: bool = True) None
Log model weights into a histogram.
Naming: with scope set to “weights”, a parameter name “model.Linear.bias” will be logged to the tag buddy.log_scope_prefix("weights/model/Linear/bias").
- Parameters
scope (str, optional) – Scope to log weights into. Defaults to “weights”.
ignore_zero_grad (bool, optional) – Ignore parameters without gradients: decreases log sizes when only parts of models are being trained. Defaults to True.
- log_scalar(name: str, value: Union[torch.Tensor, numpy.ndarray, float]) None
Convenience function for logging a scalar for visualization in TensorBoard.
Equivalent to:
buddy.log_writer.add_scalar(
    buddy.log_scope_prefix(name),
    value,
    buddy.optimizer_steps,
)
- Parameters
name (str) – Identifier for Tensorboard.
value (torch.Tensor, np.ndarray, or float) – Value to log.
- log_scope(scope: str) Generator[None, None, None]
Returns a context manager that scopes log names.
Example usage:
with buddy.log_scope("scope"):
    # Logs to scope/loss
    buddy.log_scalar("loss", loss_tensor)
- Parameters
scope (str) – Name of scope.
- log_scope_pop(scope: Optional[str] = None) None
Pop a scope we logged tensors into. See log_scope_push().
- Parameters
scope (str, optional) – Name of scope. Needs to be the top one in the stack.
- log_scope_prefix(name: str = '') str
Get or apply the current log scope prefix.
Example usage:
print(buddy.log_scope_prefix())  # ""
with buddy.log_scope("scope0"):
    print(buddy.log_scope_prefix("loss"))  # "scope0/loss"
    with buddy.log_scope("scope1"):
        print(buddy.log_scope_prefix())  # "scope0/scope1/"
- Parameters
name (str, optional) – Name to prepend a prefix to. Defaults to an empty string.
- Returns
str – Scoped log name, or scope prefix if input is empty.
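The scoping behaviour of log_scope(), log_scope_push(), log_scope_pop(), and log_scope_prefix() can be mimicked with a simple stack of names. The ScopeStack class below is a toy stand-in written for illustration, not Buddy's implementation. It assumes that scopes are joined with "/" and that popping with a name requires that name to be the innermost scope.

```python
import contextlib
from typing import Iterator, List, Optional


class ScopeStack:
    """Toy stand-in for the scoping behaviour described above (not Buddy itself)."""

    def __init__(self) -> None:
        self._scopes: List[str] = []

    def push(self, scope: str) -> None:
        self._scopes.append(scope)

    def pop(self, scope: Optional[str] = None) -> None:
        # If a name is given, it must match the top of the stack.
        if scope is not None and self._scopes[-1] != scope:
            raise ValueError(f"{scope!r} is not the innermost scope")
        self._scopes.pop()

    def prefix(self, name: str = "") -> str:
        # With no active scope, names pass through unchanged.
        if not self._scopes:
            return name
        return "/".join(self._scopes) + "/" + name

    @contextlib.contextmanager
    def scope(self, scope: str) -> Iterator[None]:
        self.push(scope)
        try:
            yield
        finally:
            self.pop(scope)


scopes = ScopeStack()
print(repr(scopes.prefix()))          # ''
with scopes.scope("scope0"):
    print(scopes.prefix("loss"))      # scope0/loss
    with scopes.scope("scope1"):
        print(repr(scopes.prefix()))  # 'scope0/scope1/'
```

The context manager is the push/pop pair wrapped in try/finally, so the scope is removed even if the body raises.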
- log_scope_push(scope: str) None
Push a scope to log tensors into.
Example usage:
buddy.log_scope_push("scope")
# Logs to scope/loss
buddy.log_scalar("loss", loss_tensor)
buddy.log_scope_pop("scope")  # name parameter is optional
- Parameters
scope (str) – Name of scope.
- property log_writer: torch.utils.tensorboard.writer.SummaryWriter
Accessor for standard Tensorboard SummaryWriter. Instantiated lazily.
Experiment Metadata
- class fannypack.utils._buddy_include._metadata._BuddyMetadata(metadata_dir: str)
Bases:
abc.ABC
Buddy’s experiment metadata management interface.
- add_metadata(content: Dict[str, Any]) None
Add human-readable metadata for this experiment. Input should be a dictionary that is merged with existing metadata.
- load_metadata(experiment_name: Optional[str] = None, metadata_dir: Optional[str] = None, path: Optional[str] = None, _write=True) None
Read existing metadata file. Note that metadata is loaded automatically: this only needs to be called if loading across experiments.
Overwrites existing metadata.
- property metadata: Dict[str, Any]
Read-only interface for experiment metadata.
- property metadata_path: str
Read-only path to my metadata file.
- set_metadata(content: Dict[str, Any]) None
Assign human-readable metadata for this experiment. Input should be a dictionary that replaces existing metadata.
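The difference between merging (add_metadata) and replacing (set_metadata) can be shown with plain dictionaries. This sketch does not call Buddy: the keys and values are made up, and a shallow merge is assumed, since how nested dictionaries are merged is not stated here.

```python
from typing import Any, Dict

metadata: Dict[str, Any] = {"dataset": "mnist", "batch_size": 32}

# add_metadata-style update: new keys are merged into the existing ones.
# (Shallow merge assumed; nested dictionaries may be handled differently.)
merged = dict(metadata)
merged.update({"batch_size": 64, "notes": "baseline"})
print(merged)  # {'dataset': 'mnist', 'batch_size': 64, 'notes': 'baseline'}

# set_metadata-style update: the new dictionary replaces the old one entirely.
replaced: Dict[str, Any] = {"notes": "fresh start"}
print(replaced)  # {'notes': 'fresh start'}
```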
Command-line Interface
Buddy’s CLI currently supports four primary functions:
buddy delete [experiment_name]: Delete an existing experiment. Displays a selection menu with metadata preview if no experiment name is passed in.
buddy info {experiment_name}: Print summary + metadata of an existing experiment.
buddy list: Print table of existing experiments + basic information.
buddy rename {source} {dest}: Rename an existing experiment.
For more details and a full list of options, run buddy {subcommand} --help.
—
The Buddy CLI also has full support for autocompleting experiment names. This needs to be registered in your .bashrc to be enabled:
# Append to .bashrc
eval "$(register-python-argcomplete buddy)"
Alternatively, for zsh:
# Append to .zshrc
autoload -U +X compinit && compinit
autoload -U +X bashcompinit && bashcompinit
eval "$(register-python-argcomplete buddy)"