ensure_csv

ensure_csv(key, *subkeys, url, name=None, force=False, download_kwargs=None, read_csv_kwargs=None)

Download a CSV file and open it as a pandas DataFrame.

Parameters:
  • key (str) – The module name

  • subkeys (str) – A sequence of additional strings to join into the subdirectory path within the module. If none are given, the file is stored directly in the directory for this module.

  • url (str) – The URL to download.

  • name (Optional[str]) – Overrides the name of the file at the end of the URL, if given. Also useful for URLs that don’t have proper filenames with extensions.

  • force (bool) – Whether to download the file again even if the path already exists. Defaults to False.

  • download_kwargs (Optional[Mapping[str, Any]]) – Keyword arguments to pass through to pystow.utils.download().

  • read_csv_kwargs (Optional[Mapping[str, Any]]) –

    Keyword arguments to pass through to pandas.read_csv().

    Note

    It is assumed that the CSV uses tab separators, as this is the only safe option. For more information, see Wikipedia and Issue #51. To override this behavior and load using the comma separator, specify read_csv_kwargs=dict(sep=",").
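The separator note above can be illustrated with a minimal sketch. It assumes the tab default behaves like a fallback that a caller-supplied `sep` replaces. The helper names `resolve_read_kwargs` and `parse_rows` are hypothetical and not part of pystow, the standard-library `csv` module stands in for `pandas.read_csv()`, and the sample row is made up for illustration.

```python
import csv
import io
from typing import Any, Dict, List, Mapping, Optional


def resolve_read_kwargs(
    read_csv_kwargs: Optional[Mapping[str, Any]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: use a tab separator unless the caller sets ``sep``."""
    kwargs = dict(read_csv_kwargs or {})  # copy so the caller's mapping is untouched
    kwargs.setdefault("sep", "\t")  # only applies when "sep" was not given
    return kwargs


def parse_rows(
    text: str, read_csv_kwargs: Optional[Mapping[str, Any]] = None
) -> List[List[str]]:
    """Parse delimited text with the stdlib csv module (stand-in for pandas)."""
    sep = resolve_read_kwargs(read_csv_kwargs)["sep"]
    return list(csv.reader(io.StringIO(text), delimiter=sep))


# Made-up sample data: the same row, once tab-separated and once comma-separated.
tab_rows = parse_rows("brazil\tcommonbloc1\tindia\n")
comma_rows = parse_rows("brazil,commonbloc1,india\n", dict(sep=","))
```

Both calls produce the same parsed row. The second one only works because the comma separator is passed explicitly, which mirrors `read_csv_kwargs=dict(sep=",")`.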

Return type:

DataFrame

Returns:

A pandas DataFrame

Example usage:

>>> import pystow
>>> import pandas as pd
>>> url = 'https://raw.githubusercontent.com/pykeen/pykeen/master/src/pykeen/datasets/nations/test.txt'
>>> df: pd.DataFrame = pystow.ensure_csv('pykeen', 'datasets', 'nations', url=url)
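
The `name` and `force` parameters describe a download-if-missing cache. The following is a rough sketch of that idea using only the standard library. It is an assumption about the general pattern, not pystow's implementation: `ensure_file` is a hypothetical function, and a local `file://` URL is used so the sketch runs without network access.

```python
import tempfile
import urllib.request
from pathlib import Path
from typing import Optional


def ensure_file(
    directory: Path, url: str, name: Optional[str] = None, force: bool = False
) -> Path:
    """Hypothetical stand-in for the download-if-missing step (not pystow code)."""
    directory.mkdir(parents=True, exist_ok=True)
    # Fall back to the last URL segment when no explicit name is given.
    path = directory / (name or url.rstrip("/").rsplit("/", 1)[-1])
    if force or not path.exists():
        urllib.request.urlretrieve(url, path)
    return path


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "source.tsv"
    source.write_text("a\tb\n1\t2\n")
    cache = Path(tmp) / "cache"

    # First call copies the file into the cache under the overriding name.
    ensure_file(cache, source.as_uri(), name="data.tsv")

    # Simulate the upstream file changing after the first download.
    source.write_text("a\tb\n3\t4\n")

    # Without force, the existing cached copy is reused.
    cached_text = ensure_file(cache, source.as_uri(), name="data.tsv").read_text()
    # With force=True, the file is fetched again and overwritten.
    forced_text = ensure_file(
        cache, source.as_uri(), name="data.tsv", force=True
    ).read_text()
```

In this sketch the cached copy keeps the old contents until `force=True` is passed, which is the trade-off to keep in mind when the remote file is expected to change.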