fannypack.data

Google Drive Downloads
HDF5 for Trajectories

Google Drive Downloads 

fannypack.data.download_drive_file(url: str, target_path: str, chunk_size=32768) → None

Download a file via a public Google Drive url.

Example usage:

from fannypack.data import download_drive_file

download_drive_file(
    "https://drive.google.com/file/d/1AsY9Cs3xE0RSlr0FKlnSKHp6zIwFSvXe/view",
    "/home/brent/Downloads/test.pdf",
)

Parameters

url (str) – Google Drive url.
target_path (str) – Destination to write to.
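
The chunk_size argument in the signature is not described above. A common pattern, and only an assumption about what happens here, is to stream the download and write it to disk one chunk at a time, so that large files never need to fit in memory. The sketch below illustrates that pattern with in-memory streams; copy_in_chunks is a hypothetical helper, not part of fannypack.

```python
import io


def copy_in_chunks(source, destination, chunk_size=32768):
    """Copy a binary stream in fixed-size chunks and return the bytes written.

    Hypothetical helper for illustration; not fannypack's implementation.
    """
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # an empty bytes object signals end of stream
            break
        destination.write(chunk)
        total += len(chunk)
    return total


# Stand-in for a download: 100,000 zero bytes copied between in-memory buffers.
payload = bytes(100_000)
out = io.BytesIO()
written = copy_in_chunks(io.BytesIO(payload), out)
```

A larger chunk size means fewer write calls at the cost of a larger temporary buffer.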

fannypack.data.cached_drive_file(name: str, url: str) → str

Return a local path to a file from Google Drive. Downloads the file if it doesn’t exist yet locally.

By default, cached files live in ~/.cache/fannypack-drive-files/. It often makes sense to move this directory (e.g. to an NFS mount): see fannypack.data.set_cache_path().

Parameters

name (str) – Name of the cached file, e.g. secret_key.pem.
url (str) – Google Drive URL, e.g. https://drive.google.com/file/d/1AsY9Cs3xE0RSlr0FKlnSKHp6zIwFSvXe/view.

Returns

str – Local path to file.
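
The download-if-missing behavior described above can be sketched with the standard library alone. This is a minimal illustration of the caching pattern, not fannypack's code: cached_file, fake_fetch, and the cache_dir argument are hypothetical names, and the "download" is faked so the example runs offline.

```python
import tempfile
from pathlib import Path


def cached_file(name, fetch, cache_dir):
    """Return a local path for `name`, calling `fetch(path)` only on a cache miss."""
    cache_dir = Path(cache_dir).expanduser()
    cache_dir.mkdir(parents=True, exist_ok=True)
    local_path = cache_dir / name
    if not local_path.exists():
        fetch(local_path)  # hypothetical downloader that writes to local_path
    return str(local_path)


calls = []


def fake_fetch(path):
    # Stand-in for a real download so the sketch needs no network access.
    calls.append(path)
    path.write_bytes(b"example contents")


with tempfile.TemporaryDirectory() as tmp:
    first = cached_file("secret_key.pem", fake_fetch, tmp)
    second = cached_file("secret_key.pem", fake_fetch, tmp)
    same_path = first == second
    download_count = len(calls)
```

The second call finds the file already on disk, so the fetch function runs only once.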

fannypack.data.set_cache_path(path: str)

Set the cache location for fannypack.data.cached_drive_file().

Parameters: path (str) – New location for cached files. Defaults to ~/.cache/fannypack-drive-files/.

HDF5 for Trajectories 

class fannypack.data.TrajectoriesFile(path: str, convert_doubles: bool = True, read_only: bool = True, compress: bool = True, verbose: bool = True)

Bases: Iterable

An interface for reading/writing trajectories via h5py.

Each TrajectoriesFile represents an iterable list of trajectories, where trajectories are stored as dictionaries that map str keys to np.ndarray contents.

Example usage (read):

from fannypack.data import TrajectoriesFile

with TrajectoriesFile('test.hdf5') as traj_file:
    for traj in traj_file:
        print(traj.keys())  # keys stored for this trajectory
        print(traj['some-key-name'])  # numpy array

Example usage (write):

traj_file = TrajectoriesFile('test.hdf5', read_only=False)

traj_file.add_meta({'label': 5})
traj_file.add_timestep({'a': 1, 'b': 2})
traj_file.add_timestep({'a': 3, 'b': 4})

with traj_file:
    traj_file.complete_trajectory()

print(len(traj_file)) # 1 trajectory!

with traj_file:
    print(traj_file[0]['label']) # 5
    print(traj_file[0]['a']) # [1, 3]
    print(traj_file[0]['b']) # [2, 4]

Note that some operations (those that interface with the filesystem) need to be called within a with statement.

Parameters

path (str) – File path for this trajectory file.
convert_doubles (bool) – Convert doubles to floats to shrink files.
read_only (bool, optional) – Open file in read-only mode.
compress (bool, optional) – Reduce file size with gzip compression.
verbose (bool, optional) – Enable debug prints.

abandon_trajectory() → None: Abandon the current trajectory.

add_meta(content: Dict[str, numpy.ndarray]) → None

Add some metadata to the current trajectory.

Parameters: content (dict) – Map from metadata keys (str) to values (np.ndarray).

add_timestep(content: Dict[str, numpy.ndarray]) → None

Add a timestep to the current trajectory.

Parameters: content (dict) – Map from timestep keys (str) to values (np.ndarray).

clear() → None: Clear the contents of the TrajectoriesFile.

complete_trajectory() → None

Write the current trajectory to disk, and mark the start of a new trajectory. Must be called inside a with statement on the TrajectoriesFile object.

The next call to add_timestep() will be time 0 of the next trajectory.
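
The lifecycle of add_meta(), add_timestep(), complete_trajectory(), and abandon_trajectory() can be pictured with a small in-memory stand-in. This is only a sketch under assumptions: TrajectoryBuffer is a hypothetical class, it uses plain lists instead of np.ndarray and HDF5, and it assumes that abandoning a trajectory discards its metadata as well as its timesteps.

```python
class TrajectoryBuffer:
    """In-memory stand-in (hypothetical) for the add/complete/abandon lifecycle."""

    def __init__(self):
        self.trajectories = []  # completed trajectories, in order
        self._meta = {}         # metadata for the trajectory in progress
        self._timesteps = {}    # key -> list of per-timestep values

    def add_meta(self, content):
        self._meta.update(content)

    def add_timestep(self, content):
        # Append each value to the running list for its key.
        for key, value in content.items():
            self._timesteps.setdefault(key, []).append(value)

    def complete_trajectory(self):
        # Store metadata plus stacked timesteps, then start a fresh trajectory.
        self.trajectories.append({**self._meta, **self._timesteps})
        self.abandon_trajectory()

    def abandon_trajectory(self):
        # Discard whatever has been buffered without storing it (assumption:
        # metadata is discarded along with the timesteps).
        self._meta, self._timesteps = {}, {}


buffer = TrajectoryBuffer()
buffer.add_meta({"label": 5})
buffer.add_timestep({"a": 1, "b": 2})
buffer.add_timestep({"a": 3, "b": 4})
buffer.complete_trajectory()

buffer.add_timestep({"a": 99})
buffer.abandon_trajectory()  # this partial trajectory is never stored
```

After these calls the buffer holds a single trajectory, matching the shape of the write example above: one metadata value and one stacked sequence per timestep key.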

get_all(key: str) → list

Get contents associated with a key from all trajectories.

Parameters: key (str) – Content identifier.
Returns: list – List of contents, indexed first by trajectory number.
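
To make the return shape concrete, the sketch below applies the same idea to a plain list of dictionaries. It is an illustration only: this free-standing get_all function and the sample data are hypothetical, and real trajectories would hold np.ndarray values read from the HDF5 file.

```python
def get_all(trajectories, key):
    """Collect the value stored under `key` from every trajectory, in order."""
    return [trajectory[key] for trajectory in trajectories]


# Two hypothetical trajectories of different lengths.
trajectories = [
    {"label": 5, "a": [1, 3]},
    {"label": 7, "a": [2, 4, 6]},
]

labels = get_all(trajectories, "label")
a_values = get_all(trajectories, "a")
```

Here a_values[1] is the full "a" sequence of the second trajectory, so the first index selects the trajectory and any further indices select within it.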

resize(count: int): Expand or contract our TrajectoriesFile.
