fannypack.data
Google Drive Downloads
- fannypack.data.download_drive_file(url: str, target_path: str, chunk_size=32768) -> None
Download a file via a public Google Drive url.
Example usage:
fannypack.data.download_drive_file(
    "https://drive.google.com/file/d/1AsY9Cs3xE0RSlr0FKlnSKHp6zIwFSvXe/view",
    "/home/brent/Downloads/test.pdf",
)
- Parameters
url (str) – Google Drive url.
target_path (str) – Destination to write to.
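The signature also accepts a chunk_size argument that the parameter list does not describe. Assuming it is the number of bytes written to target_path per iteration (a common pattern for streamed downloads, not something this reference confirms), the idea can be sketched with the standard library alone. The helper write_in_chunks and the in-memory payload are hypothetical and are not part of fannypack.

```python
import io
import os
import tempfile


def write_in_chunks(source, target_path, chunk_size=32768):
    """Copy a binary stream to target_path, chunk_size bytes at a time.

    Hypothetical helper for illustration; not part of fannypack.
    Returns the number of chunks written.
    """
    chunks_written = 0
    with open(target_path, "wb") as target:
        while True:
            chunk = source.read(chunk_size)
            if not chunk:  # empty bytes means the stream is exhausted
                break
            target.write(chunk)
            chunks_written += 1
    return chunks_written


# Stand-in for a download response body: 70,000 zero bytes.
payload = io.BytesIO(bytes(70_000))
demo_path = os.path.join(tempfile.mkdtemp(), "test.bin")
num_chunks = write_in_chunks(payload, demo_path)
print(num_chunks, os.path.getsize(demo_path))
```

Under that assumption, a smaller chunk_size lowers peak memory use at the cost of more write calls.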
- fannypack.data.cached_drive_file(name: str, url: str) -> str
Return a local path to a file from Google Drive. Downloads the file if it doesn’t exist yet locally.
By default, cached files live in ~/.cache/fannypack-drive-files/. It often makes sense to move this directory (e.g. to an NFS mount); see fannypack.data.set_cache_path().
- Parameters
name (str) – Name of the cached file, e.g. secret_key.pem.
url (str) – URL, e.g. https://drive.google.com/file/d/1AsY9Cs3xE0RSlr0FKlnSKHp6zIwFSvXe/view.
- Returns
str – Local path to file.
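The download-if-missing behavior described above can be illustrated without network access. This is a minimal sketch of the general pattern, not fannypack's implementation: cached_file, fake_fetch, and the temporary cache directory are all hypothetical names introduced here.

```python
import os
import tempfile


def cached_file(name, fetch, cache_dir):
    """Return a local path for `name`, calling `fetch(path)` only on a cache miss.

    Hypothetical stand-in for illustration; `fetch` plays the role of the download.
    """
    os.makedirs(cache_dir, exist_ok=True)
    local_path = os.path.join(cache_dir, name)
    if not os.path.exists(local_path):
        fetch(local_path)
    return local_path


fetch_calls = []


def fake_fetch(path):
    # Pretend to download: record the call and write placeholder bytes.
    fetch_calls.append(path)
    with open(path, "wb") as f:
        f.write(b"placeholder")


cache_dir = os.path.join(tempfile.mkdtemp(), "fannypack-drive-files")
first = cached_file("secret_key.pem", fake_fetch, cache_dir)
second = cached_file("secret_key.pem", fake_fetch, cache_dir)
print(first == second, len(fetch_calls))
```

The second call returns the same path without invoking the fetch function again, which is the property that makes repeated calls cheap.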
- fannypack.data.set_cache_path(path: str)
Set the cache location for fannypack.data.cached_drive_file().
- Parameters
path (str) – New location for cached files. Defaults to ~/.cache/fannypack-drive-files/.
HDF5 for Trajectories
- class fannypack.data.TrajectoriesFile(path: str, convert_doubles: bool = True, read_only: bool = True, compress: bool = True, verbose: bool = True)
Bases: Iterable
An interface for reading/writing trajectories via h5py.
Each TrajectoriesFile represents an iterable list of trajectories, where trajectories are stored as dictionaries that map str keys to np.ndarray contents.
Example usage (read):
from fannypack.data import TrajectoriesFile

with TrajectoriesFile('test.hdf5') as traj_file:
    for traj in traj_file:
        print(traj.keys())  # list of keys
        print(traj['some-key-name'])  # numpy array
Example usage (write):
from fannypack.data import TrajectoriesFile

traj_file = TrajectoriesFile('test.hdf5', read_only=False)
traj_file.add_meta({'label': 5})
traj_file.add_timestep({'a': 1, 'b': 2})
traj_file.add_timestep({'a': 3, 'b': 4})
with traj_file:
    traj_file.complete_trajectory()
    print(len(traj_file))  # 1 trajectory!
with traj_file:
    print(traj_file[0]['label'])  # 5
    print(traj_file[0]['a'])  # [1, 3]
    print(traj_file[0]['b'])  # [2, 4]
Note that some operations (those that interface with the filesystem) need to be called within a with statement.
- Parameters
path (str) – File path for this trajectory file.
convert_doubles (bool, optional) – Convert doubles to floats to shrink files.
read_only (bool, optional) – Open file in read-only mode.
compress (bool, optional) – Reduce filesize w/ gzip.
verbose (bool, optional) – Enable debug prints.
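To make the data model concrete: in the write example, each key passed to add_timestep() ends up as a sequence with one entry per timestep, while metadata is stored once per trajectory. The sketch below mimics that shape with plain Python lists instead of NumPy arrays and no HDF5 file. The function stack_timesteps is hypothetical and says nothing about how the class is implemented internally.

```python
def stack_timesteps(meta, timesteps):
    """Combine per-trajectory metadata and per-timestep dicts into one dict.

    Hypothetical illustration of the trajectory layout; not fannypack code.
    Assumes metadata keys and timestep keys do not overlap.
    """
    trajectory = dict(meta)  # metadata is stored once, unchanged
    for step in timesteps:
        for key, value in step.items():
            # Each timestep key accumulates one entry per timestep.
            trajectory.setdefault(key, []).append(value)
    return trajectory


trajectory = stack_timesteps(
    meta={"label": 5},
    timesteps=[{"a": 1, "b": 2}, {"a": 3, "b": 4}],
)
print(trajectory)
```

The resulting dictionary has the same keys and values as traj_file[0] in the write example: label is 5, a is [1, 3], and b is [2, 4].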
- abandon_trajectory() -> None
Abandon the current trajectory.
- add_meta(content: Dict[str, numpy.ndarray]) -> None
Add some metadata to the current trajectory.
- Parameters
content (dict) – Map from metadata keys (str) to values (np.ndarray).
- add_timestep(content: Dict[str, numpy.ndarray]) -> None
Add a timestep to the current trajectory.
- Parameters
content (dict) – Map from timestep keys (str) to values (np.ndarray).
- clear() -> None
Clear the contents of the TrajectoriesFile.
- complete_trajectory() -> None
Write the current trajectory to disk, and mark the start of a new trajectory. Must be called with the TrajectoriesFile object in a with statement. The next call to add_timestep() will be time 0 of the next trajectory.
- get_all(key: str) -> list
Get contents associated with a key from all trajectories.
- Parameters
key (str) – Content identifier.
- Returns
list – List of contents. First index is trajectory #.
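The return shape can be illustrated with plain dictionaries standing in for stored trajectories. This is a sketch of the documented behavior only: the free function get_all and the sample data below are hypothetical, and the real method reads from the HDF5 file instead.

```python
def get_all(trajectories, key):
    """Collect the contents stored under `key` from every trajectory.

    Hypothetical stand-in for illustration; the result is indexed by
    trajectory number first, as described above.
    """
    return [traj[key] for traj in trajectories]


# Two made-up trajectories of different lengths.
trajectories = [
    {"label": 5, "a": [1, 3]},
    {"label": 7, "a": [0, 2, 4]},
]
labels = get_all(trajectories, "label")
a_values = get_all(trajectories, "a")
print(labels)
print(a_values[1])
```

Because trajectories may differ in length, the result is a list with one entry per trajectory and not a single stacked array.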
- resize(count: int)
Expand or contract our TrajectoriesFile.