Module 

Returns:

The path of the file that has been downloaded (or already exists)

ensure_csv(*subkeys, url, name=None, force=False, download_kwargs=None, read_csv_kwargs=None)[source]

Download a CSV and open as a dataframe with pandas.

Parameters:

subkeys (str) – A sequence of additional strings to join. If none are given, returns the directory for this module.
url (str) – The URL to download.
name (Optional[str]) – Overrides the name of the file at the end of the URL, if given. Also useful for URLs that don’t have proper filenames with extensions.
force (bool) – Should the download be done again, even if the path already exists? Defaults to false.
download_kwargs (Optional[Mapping[str, Any]]) – Keyword arguments to pass through to pystow.utils.download().
read_csv_kwargs (Optional[Mapping[str, Any]]) – Keyword arguments to pass through to pandas.read_csv().

Returns:

A pandas DataFrame

Return type:

pandas.DataFrame

ensure_custom(*subkeys, name, force=False, provider, **kwargs)[source]

Ensure a file is present, and run a custom create function otherwise.

Parameters:

subkeys (str) – A sequence of additional strings to join. If none are given, returns the directory for this module.
name (str) – The file name.
force (bool) – Should the file be re-created, even if the path already exists?
provider (Callable[..., None]) – The file provider. Will be run with the path as the first positional argument, if the file needs to be generated.
kwargs – Additional keyword-based parameters passed to the provider.

Raises:

ValueError – If the provider was called but the file was not created by it.

Return type:

Returns:

The path of the file that has been created (or already exists)
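
The contract described above (call the provider only when the file is missing or force is set, then check that the file really exists) can be sketched with the standard library alone. This is an illustration under stated assumptions, not pystow's implementation; ensure_custom_sketch and write_greeting are hypothetical names.

```python
import tempfile
from pathlib import Path
from typing import Callable


def ensure_custom_sketch(
    path: Path, provider: Callable[..., None], force: bool = False, **kwargs
) -> Path:
    """Hypothetical helper mirroring the documented contract of ensure_custom."""
    if force or not path.is_file():
        # The provider receives the target path as its first positional argument.
        provider(path, **kwargs)
        if not path.is_file():
            raise ValueError(f"provider did not create {path}")
    return path


def write_greeting(path: Path, text: str = "hello") -> None:
    """Hypothetical provider that writes a small text file."""
    path.write_text(text)


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "greeting.txt"
    ensure_custom_sketch(target, write_greeting, text="first")
    # The file now exists, so the provider is skipped on the second call.
    ensure_custom_sketch(target, write_greeting, text="second")
    cached = target.read_text()
    # force=True re-runs the provider even though the file exists.
    ensure_custom_sketch(target, write_greeting, force=True, text="third")
    forced = target.read_text()
    # A provider that writes nothing triggers the ValueError.
    try:
        ensure_custom_sketch(Path(tmp) / "missing.txt", lambda path: None)
        raised = False
    except ValueError:
        raised = True
```

Checking for the file after the provider returns is what turns a silently failing provider into an immediate, visible error.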

ensure_excel(*subkeys, url, name=None, force=False, download_kwargs=None, read_excel_kwargs=None)[source]

Download an Excel file and open it as a dataframe with pandas.

Parameters:

subkeys (str) – A sequence of additional strings to join. If none are given, returns the directory for this module.
url (str) – The URL to download.
name (Optional[str]) – Overrides the name of the file at the end of the URL, if given. Also useful for URLs that don’t have proper filenames with extensions.
force (bool) – Should the download be done again, even if the path already exists? Defaults to false.
download_kwargs (Optional[Mapping[str, Any]]) – Keyword arguments to pass through to pystow.utils.download().
read_excel_kwargs (Optional[Mapping[str, Any]]) – Keyword arguments to pass through to pandas.read_excel().

Return type:

DataFrame

Returns:

A pandas DataFrame

ensure_from_google(*subkeys, name, file_id, force=False, download_kwargs=None)[source]

Ensure a file is downloaded from Google Drive.

Parameters:

subkeys (str) – A sequence of additional strings to join. If none are given, returns the directory for this module.
name (str) – The name of the file
file_id (str) – The file identifier of the Google Drive file. If your share link is https://drive.google.com/file/d/1AsPPU4ka1Rc9u-XYMGWtvV65hF3egi0z/view, then your file ID is 1AsPPU4ka1Rc9u-XYMGWtvV65hF3egi0z.
force (bool) – Should the download be done again, even if the path already exists? Defaults to false.
download_kwargs (Optional[Mapping[str, Any]]) – Keyword arguments to pass through to pystow.utils.download_from_google().

Return type:

Returns:

The path of the file that has been downloaded (or already exists)
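
If you only have a share link, the file identifier can be pulled out with a small regular expression. The helper below is a hypothetical convenience, not part of pystow, and it assumes the link follows the /file/d/<id>/ layout shown in the parameter description.

```python
import re
from typing import Optional

# Assumption: share links follow the /file/d/<id>/ layout from the example above.
_FILE_ID_PATTERN = re.compile(r"/file/d/([A-Za-z0-9_-]+)")


def extract_file_id(share_link: str) -> Optional[str]:
    """Return the file identifier from a share link, or None if none is found."""
    match = _FILE_ID_PATTERN.search(share_link)
    return match.group(1) if match else None


link = "https://drive.google.com/file/d/1AsPPU4ka1Rc9u-XYMGWtvV65hF3egi0z/view"
file_id = extract_file_id(link)
```

The resulting string is what would be passed as file_id.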

ensure_from_s3(*subkeys, s3_bucket, s3_key, name=None, client=None, client_kwargs=None, download_file_kwargs=None, force=False)[source]

Ensure a file is downloaded from an S3 bucket.

Parameters:

subkeys (str) – A sequence of additional strings to join. If none are given, returns the directory for this module.
s3_bucket (str) – The S3 bucket name
s3_key (Union[str, Sequence[str]]) – The S3 key name
name (Optional[str]) – Overrides the name of the file at the end of the S3 key, if given.
client (Optional[BaseClient]) – A botocore client. If none given, one will be created automatically
client_kwargs (Optional[Mapping[str, Any]]) – Keyword arguments to be passed to the client on instantiation.
download_file_kwargs (Optional[Mapping[str, Any]]) – Keyword arguments to be passed to boto3.s3.transfer.S3Transfer.download_file()
force (bool) – Should the download be done again, even if the path already exists? Defaults to false.

Return type:

Returns:

The path of the file that has been downloaded (or already exists)

ensure_gunzip(*subkeys, url, name=None, force=False, autoclean=True, download_kwargs=None)[source]

Ensure a gzipped file is downloaded and decompressed.

Parameters:

subkeys (str) – A sequence of additional strings to join. If none are given, returns the directory for this module.
url (str) – The URL to download.
name (Optional[str]) – Overrides the name of the file at the end of the URL, if given. Also useful for URLs that don’t have proper filenames with extensions.
force (bool) – Should the download be done again, even if the path already exists? Defaults to false.
autoclean (bool) – Should the gzipped file be deleted after it has been decompressed?
download_kwargs (Optional[Mapping[str, Any]]) – Keyword arguments to pass through to pystow.utils.download().

Return type:

Returns:

The path of the directory where the file that has been downloaded gets extracted to
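
The decompress-then-clean-up step can be sketched with gzip and shutil. This is a minimal illustration, not pystow's code: gunzip_sketch is a hypothetical name, and returning the path of the decompressed file is a choice made for this sketch.

```python
import gzip
import shutil
import tempfile
from pathlib import Path


def gunzip_sketch(source: Path, autoclean: bool = True) -> Path:
    """Decompress source next to itself, dropping the trailing .gz suffix."""
    target = source.with_suffix("")
    with gzip.open(source, "rb") as src, target.open("wb") as dst:
        shutil.copyfileobj(src, dst)
    if autoclean:
        # Remove the compressed file once the decompressed copy is written.
        source.unlink()
    return target


with tempfile.TemporaryDirectory() as tmp:
    archive = Path(tmp) / "data.tsv.gz"
    with gzip.open(archive, "wt", encoding="utf-8") as file:
        file.write("a\tb\n")
    extracted = gunzip_sketch(archive, autoclean=True)
    extracted_name = extracted.name
    extracted_text = extracted.read_text(encoding="utf-8").strip()
    archive_still_exists = archive.exists()
```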

ensure_json(*subkeys, url, name=None, force=False, download_kwargs=None, open_kwargs=None, json_load_kwargs=None)[source]

Download JSON and open with json.

Parameters:

subkeys (str) – A sequence of additional strings to join. If none are given, returns the directory for this module.
url (str) – The URL to download.
name (Optional[str]) – Overrides the name of the file at the end of the URL, if given. Also useful for URLs that don’t have proper filenames with extensions.
force (bool) – Should the download be done again, even if the path already exists? Defaults to false.
download_kwargs (Optional[Mapping[str, Any]]) – Keyword arguments to pass through to pystow.utils.download().
open_kwargs (Optional[Mapping[str, Any]]) – Additional keyword arguments passed to open()
json_load_kwargs (Optional[Mapping[str, Any]]) – Keyword arguments to pass through to json.load().

Return type:

Returns:

A JSON object (list, dict, etc.)

ensure_json_bz2(*subkeys, url, name=None, force=False, download_kwargs=None, open_kwargs=None, json_load_kwargs=None)[source]

Download BZ2-compressed JSON and open with json.

Parameters:

subkeys (str) – A sequence of additional strings to join. If none are given, returns the directory for this module.
url (str) – The URL to download.
name (Optional[str]) – Overrides the name of the file at the end of the URL, if given. Also useful for URLs that don’t have proper filenames with extensions.
force (bool) – Should the download be done again, even if the path already exists? Defaults to false.
download_kwargs (Optional[Mapping[str, Any]]) – Keyword arguments to pass through to pystow.utils.download().
open_kwargs (Optional[Mapping[str, Any]]) – Additional keyword arguments passed to bz2.open()
json_load_kwargs (Optional[Mapping[str, Any]]) – Keyword arguments to pass through to json.load().

Returns:

A JSON object (list, dict, etc.)
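
How open_kwargs and json_load_kwargs are routed can be shown on a local file. The function below is a sketch under the assumption that the two mappings are forwarded unchanged to bz2.open() and json.load(); load_json_bz2_sketch is a hypothetical name and no download is involved.

```python
import bz2
import json
import tempfile
from pathlib import Path


def load_json_bz2_sketch(path, open_kwargs=None, json_load_kwargs=None):
    """Open a BZ2-compressed JSON file in text mode and parse it."""
    with bz2.open(path, mode="rt", **(open_kwargs or {})) as file:
        return json.load(file, **(json_load_kwargs or {}))


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "records.json.bz2"
    with bz2.open(path, mode="wt", encoding="utf-8") as file:
        json.dump({"name": "example", "scores": [1.5, 2]}, file)
    # open_kwargs reach bz2.open(); here they pin the text encoding.
    plain = load_json_bz2_sketch(path, open_kwargs={"encoding": "utf-8"})
    # json_load_kwargs reach json.load(); parse_float=str keeps floats as text.
    as_text = load_json_bz2_sketch(path, json_load_kwargs={"parse_float": str})
```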

ensure_open(*subkeys, url, name=None, force=False, download_kwargs=None, mode='r', open_kwargs=None)[source]

Ensure a file is downloaded and open it.

Parameters:

subkeys (str) – A sequence of additional strings to join. If none are given, returns the directory for this module.
url (str) – The URL to download.
name (Optional[str]) – Overrides the name of the file at the end of the URL, if given. Also useful for URLs that don’t have proper filenames with extensions.
force (bool) – Should the download be done again, even if the path already exists? Defaults to false.
download_kwargs (Optional[Mapping[str, Any]]) – Keyword arguments to pass through to pystow.utils.download().
mode (str) – The read mode, passed to open()
open_kwargs (Optional[Mapping[str, Any]]) – Additional keyword arguments passed to open()

Yields:

An open file object

Return type:

ensure_open_bz2(*subkeys, url, name=None, force=False, download_kwargs=None, mode='rb', open_kwargs=None)[source]

Ensure a BZ2-compressed file is downloaded and open a file inside it.

Parameters:

subkeys (str) – A sequence of additional strings to join. If none are given, returns the directory for this module.
url (str) – The URL to download.
name (Optional[str]) – Overrides the name of the file at the end of the URL, if given. Also useful for URLs that don’t have proper filenames with extensions.
force (bool) – Should the download be done again, even if the path already exists? Defaults to false.
download_kwargs (Optional[Mapping[str, Any]]) – Keyword arguments to pass through to pystow.utils.download().
mode (str) – The read mode, passed to bz2.open()
open_kwargs (Optional[Mapping[str, Any]]) – Additional keyword arguments passed to bz2.open()

Yields:

An open file object

Return type:

ensure_open_gz(*subkeys, url, name=None, force=False, download_kwargs=None, mode='rb', open_kwargs=None)[source]

Ensure a gzipped file is downloaded and open a file inside it.

Parameters:

subkeys (str) – A sequence of additional strings to join. If none are given, returns the directory for this module.
url (str) – The URL to download.
name (Optional[str]) – Overrides the name of the file at the end of the URL, if given. Also useful for URLs that don’t have proper filenames with extensions.
force (bool) – Should the download be done again, even if the path already exists? Defaults to false.
download_kwargs (Optional[Mapping[str, Any]]) – Keyword arguments to pass through to pystow.utils.download().
mode (str) – The read mode, passed to gzip.open()
open_kwargs (Optional[Mapping[str, Any]]) – Additional keyword arguments passed to gzip.open()

Yields:

An open file object

Return type:

ensure_open_lzma(*subkeys, url, name=None, force=False, download_kwargs=None, mode='rt', open_kwargs=None)[source]

Ensure an LZMA-compressed file is downloaded and open a file inside it.

Parameters:

subkeys (str) – A sequence of additional strings to join. If none are given, returns the directory for this module.
url (str) – The URL to download.
name (Optional[str]) – Overrides the name of the file at the end of the URL, if given. Also useful for URLs that don’t have proper filenames with extensions.
force (bool) – Should the download be done again, even if the path already exists? Defaults to false.
download_kwargs (Optional[Mapping[str, Any]]) – Keyword arguments to pass through to pystow.utils.download().
mode (str) – The read mode, passed to lzma.open()
open_kwargs (Optional[Mapping[str, Any]]) – Additional keyword arguments passed to lzma.open()

Yields:

An open file object

Return type:

ensure_open_sqlite(*subkeys, url, name=None, force=False, download_kwargs=None)[source]

Ensure and connect to a SQLite database.

Parameters:

subkeys (str) – A sequence of additional strings to join. If none are given, returns the directory for this module.
url (str) – The URL to download.
name (Optional[str]) – Overrides the name of the file at the end of the URL, if given. Also useful for URLs that don’t have proper filenames with extensions.
force (bool) – Should the download be done again, even if the path already exists? Defaults to false.
download_kwargs (Optional[Mapping[str, Any]]) – Keyword arguments to pass through to pystow.utils.download().

Yields:

An instance of sqlite3.Connection from sqlite3.connect()

Example usage:

>>> import pystow
>>> import pandas as pd
>>> url = "https://s3.amazonaws.com/bbop-sqlite/hp.db"
>>> sql = "SELECT * FROM entailed_edge LIMIT 10"
>>> module = pystow.module("test")
>>> with module.ensure_open_sqlite(url=url) as conn:
...     df = pd.read_sql(sql, conn)

ensure_open_sqlite_gz(*subkeys, url, name=None, force=False, download_kwargs=None)[source]

Ensure and connect to a SQLite database that’s gzipped.

Unfortunately, it’s a paid feature to directly read gzipped sqlite files, so this automatically gunzips it first.

Parameters:

subkeys (str) – A sequence of additional strings to join. If none are given, returns the directory for this module.
url (str) – The URL to download.
name (Optional[str]) – Overrides the name of the file at the end of the URL, if given. Also useful for URLs that don’t have proper filenames with extensions.
force (bool) – Should the download be done again, even if the path already exists? Defaults to false.
download_kwargs (Optional[Mapping[str, Any]]) – Keyword arguments to pass through to pystow.utils.download().

Yields:

An instance of sqlite3.Connection from sqlite3.connect()

Example usage:

>>> import pystow
>>> import pandas as pd
>>> url = "https://s3.amazonaws.com/bbop-sqlite/hp.db.gz"
>>> module = pystow.module("test")
>>> sql = "SELECT * FROM entailed_edge LIMIT 10"
>>> with module.ensure_open_sqlite_gz(url=url) as conn:
...     df = pd.read_sql(sql, conn)
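
The gunzip-then-connect behaviour can be reproduced offline with the standard library. The sketch below is not pystow's implementation: open_sqlite_gz_sketch is a hypothetical name, and placing the decompressed database next to the archive is an assumption made for illustration.

```python
import gzip
import shutil
import sqlite3
import tempfile
from contextlib import closing, contextmanager
from pathlib import Path


@contextmanager
def open_sqlite_gz_sketch(gz_path: Path):
    """Decompress a gzipped SQLite file once, then yield a connection to it."""
    db_path = gz_path.with_suffix("")
    if not db_path.is_file():
        with gzip.open(gz_path, "rb") as src, db_path.open("wb") as dst:
            shutil.copyfileobj(src, dst)
    # closing() makes sure the connection is closed when the block exits.
    with closing(sqlite3.connect(str(db_path))) as conn:
        yield conn


with tempfile.TemporaryDirectory() as tmp:
    # Build a tiny database and gzip it so the sketch has something to read.
    raw = Path(tmp) / "raw.db"
    with closing(sqlite3.connect(str(raw))) as conn:
        conn.execute("CREATE TABLE edge (subject TEXT, object TEXT)")
        conn.execute("INSERT INTO edge VALUES ('a', 'b')")
        conn.commit()
    gz_path = Path(tmp) / "example.db.gz"
    with raw.open("rb") as src, gzip.open(gz_path, "wb") as dst:
        shutil.copyfileobj(src, dst)

    with open_sqlite_gz_sketch(gz_path) as conn:
        rows = conn.execute("SELECT subject, object FROM edge").fetchall()
```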

ensure_open_tarfile(*subkeys, url, inner_path, name=None, force=False, download_kwargs=None, mode='r', open_kwargs=None)[source]

Ensure a tar file is downloaded and open a file inside it.

Parameters:

subkeys (str) – A sequence of additional strings to join. If none are given, returns the directory for this module.
url (str) – The URL to download.
inner_path (str) – The relative path to the file inside the archive
name (Optional[str]) – Overrides the name of the file at the end of the URL, if given. Also useful for URLs that don’t have proper filenames with extensions.
force (bool) – Should the download be done again, even if the path already exists? Defaults to false.
download_kwargs (Optional[Mapping[str, Any]]) – Keyword arguments to pass through to pystow.utils.download().
mode (str) – The read mode, passed to tarfile.open()
open_kwargs (Optional[Mapping[str, Any]]) – Additional keyword arguments passed to tarfile.open()

Yields:

An open file object

Return type:

ensure_open_zip(*subkeys, url, inner_path, name=None, force=False, download_kwargs=None, mode='r', open_kwargs=None)[source]

Ensure a file is downloaded then open it with zipfile.

Parameters:

subkeys (str) – A sequence of additional strings to join. If none are given, returns the directory for this module.
url (str) – The URL to download.
inner_path (str) – The relative path to the file inside the archive
name (Optional[str]) – Overrides the name of the file at the end of the URL, if given. Also useful for URLs that don’t have proper filenames with extensions.
force (bool) – Should the download be done again, even if the path already exists? Defaults to false.
download_kwargs (Optional[Mapping[str, Any]]) – Keyword arguments to pass through to pystow.utils.download().
mode (str) – The read mode, passed to zipfile.ZipFile.open()
open_kwargs (Optional[Mapping[str, Any]]) – Additional keyword arguments passed to zipfile.ZipFile.open()

Yields:

An open file object
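
Reading a single member by its inner path looks roughly like this with the standard library. The archive and member names are made up for the example; files opened from a zip archive are binary, so the bytes are decoded explicitly.

```python
import tempfile
import zipfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Build a small archive locally so the example needs no download.
    archive = Path(tmp) / "bundle.zip"
    with zipfile.ZipFile(archive, "w") as zip_file:
        zip_file.writestr("inner/data.txt", "line 1\nline 2\n")

    with zipfile.ZipFile(archive) as zip_file:
        # inner_path is relative to the archive root.
        with zip_file.open("inner/data.txt", mode="r") as binary_file:
            lines = binary_file.read().decode("utf-8").splitlines()
```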

Return type:

ensure_pickle(*subkeys, url, name=None, force=False, download_kwargs=None, mode='rb', open_kwargs=None, pickle_load_kwargs=None)[source]

Download a pickle file and open with pickle.

Parameters:

subkeys (str) – A sequence of additional strings to join. If none are given, returns the directory for this module.
url (str) – The URL to download.
name (Optional[str]) – Overrides the name of the file at the end of the URL, if given. Also useful for URLs that don’t have proper filenames with extensions.
force (bool) – Should the download be done again, even if the path already exists? Defaults to false.
download_kwargs (Optional[Mapping[str, Any]]) – Keyword arguments to pass through to pystow.utils.download().
mode (str) – The read mode, passed to open()
open_kwargs (Optional[Mapping[str, Any]]) – Additional keyword arguments passed to open()
pickle_load_kwargs (Optional[Mapping[str, Any]]) – Keyword arguments to pass through to pickle.load().

Return type:

Returns:

Any object

ensure_pickle_gz(*subkeys, url, name=None, force=False, download_kwargs=None, mode='rb', open_kwargs=None, pickle_load_kwargs=None)[source]

Download a gzipped pickle file and open with pickle.

Parameters:

subkeys (str) – A sequence of additional strings to join. If none are given, returns the directory for this module.
url (str) – The URL to download.
name (Optional[str]) – Overrides the name of the file at the end of the URL, if given. Also useful for URLs that don’t have proper filenames with extensions.
force (bool) – Should the download be done again, even if the path already exists? Defaults to false.
download_kwargs (Optional[Mapping[str, Any]]) – Keyword arguments to pass through to pystow.utils.download().
mode (str) – The read mode, passed to gzip.open()
open_kwargs (Optional[Mapping[str, Any]]) – Additional keyword arguments passed to gzip.open()
pickle_load_kwargs (Optional[Mapping[str, Any]]) – Keyword arguments to pass through to pickle.load().

Return type:

Returns:

Any object

ensure_rdf(*subkeys, url, name=None, force=False, download_kwargs=None, precache=True, parse_kwargs=None)[source]

Download an RDF file and open with rdflib.

Parameters:

subkeys (str) – A sequence of additional strings to join. If none are given, returns the directory for this module.
url (str) – The URL to download.
name (Optional[str]) – Overrides the name of the file at the end of the URL, if given. Also useful for URLs that don’t have proper filenames with extensions.
force (bool) – Should the download be done again, even if the path already exists? Defaults to false.
download_kwargs (Optional[Mapping[str, Any]]) – Keyword arguments to pass through to pystow.utils.download().
precache (bool) – Should the parsed rdflib.Graph be stored as a pickle for fast loading?
parse_kwargs (Optional[Mapping[str, Any]]) – Keyword arguments to pass through to pystow.utils.read_rdf() and transitively to rdflib.Graph.parse().

Returns:

An RDF graph

Return type:

rdflib.Graph

ensure_tar_df(*subkeys, url, inner_path, name=None, force=False, download_kwargs=None, read_csv_kwargs=None)[source]

Download a tar file and open an inner file as a dataframe with pandas.

Parameters:

subkeys (str) – A sequence of additional strings to join. If none are given, returns the directory for this module.
url (str) – The URL to download.
inner_path (str) – The relative path to the file inside the archive
name (Optional[str]) – Overrides the name of the file at the end of the URL, if given. Also useful for URLs that don’t have proper filenames with extensions.
force (bool) – Should the download be done again, even if the path already exists? Defaults to false.
download_kwargs (Optional[Mapping[str, Any]]) – Keyword arguments to pass through to pystow.utils.download().
read_csv_kwargs (Optional[Mapping[str, Any]]) – Keyword arguments to pass through to pandas.read_csv().

Return type:

DataFrame

Returns:

A dataframe

Warning

If you need to read many files from the same archive, it is better to extract the archive first.

ensure_tar_xml(*subkeys, url, inner_path, name=None, force=False, download_kwargs=None, parse_kwargs=None)[source]

Download a tar file and open an inner file as an XML with lxml.

Parameters:

subkeys (str) – A sequence of additional strings to join. If none are given, returns the directory for this module.
url (str) – The URL to download.
inner_path (str) – The relative path to the file inside the archive
name (Optional[str]) – Overrides the name of the file at the end of the URL, if given. Also useful for URLs that don’t have proper filenames with extensions.
force (bool) – Should the download be done again, even if the path already exists? Defaults to false.
download_kwargs (Optional[Mapping[str, Any]]) – Keyword arguments to pass through to pystow.utils.download().
parse_kwargs (Optional[Mapping[str, Any]]) – Keyword arguments to pass through to lxml.etree.parse().

Returns:

An ElementTree object

Warning

If you need to read many files from the same archive, it is better to extract the archive first.

ensure_untar(*subkeys, url, name=None, directory=None, force=False, download_kwargs=None, extract_kwargs=None)[source]

Ensure a tar file is downloaded and unarchived.

Parameters:

subkeys (str) – A sequence of additional strings to join. If none are given, returns the directory for this module.
url (str) – The URL to download.
name (Optional[str]) – Overrides the name of the file at the end of the URL, if given. Also useful for URLs that don’t have proper filenames with extensions.
directory (Optional[str]) – Overrides the name of the directory into which the tar archive is extracted. If none given, will use the stem of the file name that gets downloaded.
force (bool) – Should the download be done again, even if the path already exists? Defaults to false.
download_kwargs (Optional[Mapping[str, Any]]) – Keyword arguments to pass through to pystow.utils.download().
extract_kwargs (Optional[Mapping[str, Any]]) – Keyword arguments to pass to tarfile.TarFile.extractall().

Return type:

Returns:

The path of the directory where the file that has been downloaded gets extracted to
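
The extraction step can be sketched with tarfile. This is an illustration, not pystow's code: untar_sketch is a hypothetical name, and deriving the default directory from the part of the file name before the first dot is an assumption about what "stem" means here.

```python
import tarfile
import tempfile
from pathlib import Path


def untar_sketch(archive: Path, directory=None, extract_kwargs=None) -> Path:
    """Extract archive into a sibling directory and return that directory."""
    # Assumption: the default directory is the file name up to the first dot.
    target = archive.parent / (directory or archive.name.split(".")[0])
    target.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive) as tar_file:
        tar_file.extractall(target, **(extract_kwargs or {}))
    return target


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "readme.txt"
    source.write_text("hi")
    archive = Path(tmp) / "bundle.tar.gz"
    with tarfile.open(archive, "w:gz") as tar_file:
        tar_file.add(source, arcname="readme.txt")

    out = untar_sketch(archive)
    out_name = out.name
    extracted_text = (out / "readme.txt").read_text()
```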

ensure_xml(*subkeys, url, name=None, force=False, download_kwargs=None, parse_kwargs=None)[source]

Download an XML file and open it with lxml.

Parameters:

subkeys (str) – A sequence of additional strings to join. If none are given, returns the directory for this module.
url (str) – The URL to download.
name (Optional[str]) – Overrides the name of the file at the end of the URL, if given. Also useful for URLs that don’t have proper filenames with extensions.
force (bool) – Should the download be done again, even if the path already exists? Defaults to false.
download_kwargs (Optional[Mapping[str, Any]]) – Keyword arguments to pass through to pystow.utils.download().
parse_kwargs (Optional[Mapping[str, Any]]) – Keyword arguments to pass through to lxml.etree.parse().

Return type:

ElementTree

Returns:

An ElementTree object

Warning

If you have lots of files to read in the same archive, it’s better just to unzip first.

ensure_zip_df(*subkeys, url, inner_path, name=None, force=False, download_kwargs=None, read_csv_kwargs=None)[source]

Download a zip file and open an inner file as a dataframe with pandas.

Parameters:

subkeys (str) – A sequence of additional strings to join. If none are given, returns the directory for this module.
url (str) – The URL to download.
inner_path (str) – The relative path to the file inside the archive
name (Optional[str]) – Overrides the name of the file at the end of the URL, if given. Also useful for URLs that don’t have proper filenames with extensions.
force (bool) – Should the download be done again, even if the path already exists? Defaults to false.
download_kwargs (Optional[Mapping[str, Any]]) – Keyword arguments to pass through to pystow.utils.download().
read_csv_kwargs (Optional[Mapping[str, Any]]) – Keyword arguments to pass through to pandas.read_csv().

Returns:

A pandas DataFrame

Return type:

pandas.DataFrame

ensure_zip_np(*subkeys, url, inner_path, name=None, force=False, download_kwargs=None, load_kwargs=None)[source]

Download a zip file and open an inner file as an array-like with numpy.

Parameters:

subkeys (str) – A sequence of additional strings to join. If none are given, returns the directory for this module.
url (str) – The URL to download.
inner_path (str) – The relative path to the file inside the archive
name (Optional[str]) – Overrides the name of the file at the end of the URL, if given. Also useful for URLs that don’t have proper filenames with extensions.
force (bool) – Should the download be done again, even if the path already exists? Defaults to false.
download_kwargs (Optional[Mapping[str, Any]]) – Keyword arguments to pass through to pystow.utils.download().
load_kwargs (Optional[Mapping[str, Any]]) – Additional keyword arguments that are passed through to read_zip_np() and transitively to numpy.load().

Returns:

An array-like object

Return type:

numpy.typing.ArrayLike

classmethod from_key(key, *subkeys, ensure_exists=True)[source]

Get a module for the given directory or one of its subdirectories.

Parameters:

key (str) – The name of the module. No funny characters. The envvar <key>_HOME where key is uppercased is checked first before using the default home directory.
subkeys (str) – A sequence of additional strings to join. If none are given, returns the directory for this module.
ensure_exists (bool) – Should all directories be created automatically? Defaults to true.

Return type:

Module

Returns:

A module
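
The environment-variable lookup described for key can be sketched as follows. resolve_home_sketch is a hypothetical helper, and the fallback location (a .data folder in the user's home directory) is an assumption made for illustration rather than a statement about the library's default.

```python
import os
from pathlib import Path


def resolve_home_sketch(key: str, environ=None) -> Path:
    """Resolve a module directory, preferring the <KEY>_HOME variable."""
    environ = os.environ if environ is None else environ
    override = environ.get(f"{key.upper()}_HOME")
    if override:
        return Path(override)
    # Assumption: fall back to a per-key folder under ~/.data.
    return Path.home() / ".data" / key


custom = resolve_home_sketch("test", {"TEST_HOME": "/tmp/custom"})
default = resolve_home_sketch("test", {})
```

Passing the environment as a mapping keeps the lookup easy to test without touching the real process environment.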

join(*subkeys, name=None, ensure_exists=True)[source]

Get a subdirectory of the current module.

Parameters:

subkeys (str) – A sequence of additional strings to join. If none are given, returns the directory for this module.
ensure_exists (bool) – Should all directories be created automatically? Defaults to true.
name (Optional[str]) – The name of the file (optional) inside the folder

Return type:

Returns:

The path of the directory or subdirectory for the given module.

joinpath_sqlite(*subkeys, name)[source]

Get an SQLite database connection string.

Parameters:

subkeys (str) – A sequence of additional strings to join. If none are given, returns the directory for this module.
name (str) – The name of the database file.

Return type:

str

Returns:

A SQLite path string.
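
A connection string of this kind is typically a SQLAlchemy-style URL built from the file path. The sketch below assumes that format; sqlite_url_sketch is a hypothetical name, and the exact string the method returns should be checked against the library.

```python
from pathlib import PurePosixPath


def sqlite_url_sketch(path: PurePosixPath) -> str:
    """Build a SQLAlchemy-style SQLite URL (assumed format) from a path."""
    # Three slashes introduce the path; an absolute path adds a fourth.
    return f"sqlite:///{path.as_posix()}"


url = sqlite_url_sketch(PurePosixPath("/data/test/cache.db"))
```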

load_df(*subkeys, name, read_csv_kwargs=None)[source]

Open a pre-existing CSV as a dataframe with pandas.

Parameters:

subkeys (str) – A sequence of additional strings to join. If none are given, returns the directory for this module.
name (str) – The name of the file to open
read_csv_kwargs (Optional[Mapping[str, Any]]) – Keyword arguments to pass through to pandas.read_csv().

Return type:

DataFrame

Returns:

A pandas DataFrame

load_json(*subkeys, name, open_kwargs=None, json_load_kwargs=None)[source]

Open a JSON file with json.

Parameters:

subkeys (str) – A sequence of additional strings to join. If none are given, returns the directory for this module.
name (str) – The name of the file to open
open_kwargs (Optional[Mapping[str, Any]]) – Additional keyword arguments passed to open()
json_load_kwargs (Optional[Mapping[str, Any]]) – Keyword arguments to pass through to json.load().

Return type:

Returns:

A JSON object (list, dict, etc.)

load_pickle(*subkeys, name, mode='rb', open_kwargs=None, pickle_load_kwargs=None)[source]

Open a pickle file with pickle.

Parameters:

subkeys (str) – A sequence of additional strings to join. If none are given, returns the directory for this module.
name (str) – The name of the file to open
mode (str) – The read mode, passed to open()
open_kwargs (Optional[Mapping[str, Any]]) – Additional keyword arguments passed to open()
pickle_load_kwargs (Optional[Mapping[str, Any]]) – Keyword arguments to pass through to pickle.load().

Return type:

Returns:

Any object

load_pickle_gz(*subkeys, name, mode='rb', open_kwargs=None, pickle_load_kwargs=None)[source]

Open a gzipped pickle file with pickle.

Parameters:

subkeys (str) – A sequence of additional strings to join. If none are given, returns the directory for this module.
name (str) – The name of the file to open
mode (str) – The read mode, passed to gzip.open()
open_kwargs (Optional[Mapping[str, Any]]) – Additional keyword arguments passed to gzip.open()
pickle_load_kwargs (Optional[Mapping[str, Any]]) – Keyword arguments to pass through to pickle.load().

Return type:

Returns:

Any object
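
A round trip through a gzipped pickle shows how the parameters fit together. The function is a sketch assuming that mode and open_kwargs go to gzip.open() and pickle_load_kwargs go to pickle.load(); load_pickle_gz_sketch is a hypothetical name. Only unpickle data from sources you trust, since loading a pickle can execute arbitrary code.

```python
import gzip
import pickle
import tempfile
from pathlib import Path


def load_pickle_gz_sketch(path, mode="rb", open_kwargs=None, pickle_load_kwargs=None):
    """Open a gzipped pickle and return the object stored in it."""
    with gzip.open(path, mode=mode, **(open_kwargs or {})) as file:
        return pickle.load(file, **(pickle_load_kwargs or {}))


payload = {"terms": ["a", "b"], "count": 2}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "payload.pkl.gz"
    with gzip.open(path, mode="wb") as file:
        pickle.dump(payload, file)
    restored = load_pickle_gz_sketch(path)
```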

load_rdf(*subkeys, name=None, parse_kwargs=None)[source]

Open an RDF file with rdflib.

Parameters:

subkeys (str) – A sequence of additional strings to join. If none are given, returns the directory for this module.
name (Optional[str]) – The name of the file to open
parse_kwargs (Optional[Mapping[str, Any]]) – Keyword arguments to pass through to pystow.utils.read_rdf() and transitively to rdflib.Graph.parse().

Return type:

Graph

Returns:

An RDF graph

load_xml(*subkeys, name, parse_kwargs=None)[source]

Load an XML file with lxml.

Parameters:

subkeys (str) – A sequence of additional strings to join. If none are given, returns the directory for this module.
name (str) – The name of the file to open
parse_kwargs (Optional[Mapping[str, Any]]) – Keyword arguments to pass through to lxml.etree.parse().

Return type:

ElementTree

Returns:

An ElementTree object

Warning

If you have lots of files to read in the same archive, it’s better just to unzip first.

module(*subkeys, ensure_exists=True)[source]

Get a module for a subdirectory of the current module.

Parameters:

subkeys (str) – A sequence of additional strings to join. If none are given, returns the directory for this module.
ensure_exists (bool) – Should all directories be created automatically? Defaults to true.

Return type:

Module

Returns:

A module representing the subdirectory based on the given subkeys.

open(*subkeys, name, mode='r', open_kwargs=None, ensure_exists=False)[source]

Open a file that exists already.

Parameters:

subkeys (str) – A sequence of additional strings to join. If none are given, returns the directory for this module.
name (str) – The name of the file to open
mode (str) – The read mode, passed to open()
open_kwargs (Optional[Mapping[str, Any]]) – Additional keyword arguments passed to open()
ensure_exists (bool) – Should the file be made? Set to true on write operations.

Yields:

An open file object

Return type:

open_gz(*subkeys, name, mode='rt', open_kwargs=None, ensure_exists=False)[source]

Open a gzipped file that exists already.

Parameters:

subkeys (str) – A sequence of additional strings to join. If none are given, returns the directory for this module.
name (str) – The name of the file to open
mode (str) – The read mode, passed to gzip.open()
open_kwargs (Optional[Mapping[str, Any]]) – Additional keyword arguments passed to gzip.open()
ensure_exists (bool) – Should the file be made? Set to true on write operations.

Yields:

An open file object
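
The mode string decides whether gzip.open() yields text or bytes, which is why the default here is 'rt'. The standard-library sketch below writes and reads a local file with made-up contents; it does not use pystow itself.

```python
import gzip
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "notes.txt.gz"
    # 'wt' writes text; the file must be created before it can be read back.
    with gzip.open(path, mode="wt", encoding="utf-8") as file:
        file.write("first\nsecond\n")
    # 'rt' yields decoded lines of text.
    with gzip.open(path, mode="rt", encoding="utf-8") as file:
        lines = [line.strip() for line in file]
    # 'rb' yields the decompressed bytes instead.
    with gzip.open(path, mode="rb") as file:
        raw = file.read()
```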

Return type: