ensure_rdf

ensure_rdf(key, *subkeys, url, name=None, force=False, download_kwargs=None, precache=True, parse_kwargs=None)

Download an RDF file and open it with rdflib.

Parameters:
  • key (str) – The module name

  • subkeys (str) – A sequence of additional strings to join. If none are given, returns the directory for this module.

  • url (str) – The URL to download.

  • name (Optional[str]) – Overrides the name of the file at the end of the URL, if given. Also useful for URLs that don’t have proper filenames with extensions.

  • force (bool) – Should the download be done again, even if the path already exists? Defaults to false.

  • download_kwargs (Optional[Mapping[str, Any]]) – Keyword arguments to pass through to pystow.utils.download().

  • precache (bool) – Should the parsed rdflib.Graph be stored as a pickle for fast loading?

  • parse_kwargs (Optional[Mapping[str, Any]]) – Keyword arguments to pass through to pystow.utils.read_rdf() and transitively to rdflib.Graph.parse().

Return type:

Graph

Returns:

An RDF graph

Example usage:

>>> import pystow
>>> import rdflib
>>> url = 'https://ftp.expasy.org/databases/rhea/rdf/rhea.rdf.gz'
>>> rdf_graph: rdflib.Graph = pystow.ensure_rdf('rhea', url=url)

If rdflib fails to guess the format, you can explicitly specify it using the parse_kwargs argument:

>>> import pystow
>>> import rdflib
>>> url = (
...     "http://oaei.webdatacommons.org/tdrs/testdata/persistent/knowledgegraph"
...     "/v3/suite/memoryalpha-stexpanded/component/reference.xml"
... )
>>> rdf_graph: rdflib.Graph = pystow.ensure_rdf("memoryalpha-stexpanded", url=url, parse_kwargs={"format": "xml"})
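
The idea behind the precache and force parameters can be illustrated with a minimal, standard-library sketch. It is an assumption about the general pattern, not pystow's actual code: load_with_pickle_cache, the parse callback, and the ".pickle" suffix are all hypothetical names chosen for illustration.

```python
import pickle
from pathlib import Path
from typing import Any, Callable


def load_with_pickle_cache(
    path: Path,
    parse: Callable[[Path], Any],
    force: bool = False,
) -> Any:
    """Hypothetical helper: parse a file once, then reuse a pickled copy."""
    # Assumption: the cache sits next to the source file with an extra suffix.
    cache_path = path.with_suffix(path.suffix + ".pickle")
    if cache_path.is_file() and not force:
        # Fast path: skip parsing and load the previously stored object.
        with cache_path.open("rb") as file:
            return pickle.load(file)
    # Slow path: parse the source file, then store the result for next time.
    result = parse(path)
    with cache_path.open("wb") as file:
        pickle.dump(result, file, protocol=pickle.HIGHEST_PROTOCOL)
    return result
```

With a pattern like this, the first call pays the full parsing cost and later calls only unpickle the result, which is the benefit the precache description refers to for large RDF files. Pickle files should only be loaded from locations you trust, since unpickling can execute arbitrary code.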