ensure_rdf

ensure_rdf(key: str, *subkeys: str, url: str, name: str | None = None, force: bool = False, download_kwargs: DownloadKwargs | None = None, precache: bool = True, parse_kwargs: Mapping[str, Any] | None = None) → rdflib.Graph

Download an RDF file and open it with rdflib.

Parameters:
  • key – The module name

  • subkeys – A sequence of additional strings to join into a subdirectory path. If none are given, the file is stored directly in the directory for this module.

  • url – The URL to download.

  • name – Overrides the name of the file at the end of the URL, if given. Also useful for URLs that don’t have proper filenames with extensions.

  • force – Should the download be done again, even if the path already exists? Defaults to false.

  • download_kwargs – Keyword arguments to pass through to pystow.utils.download().

  • precache – Should the parsed rdflib.Graph be stored as a pickle for fast loading? Defaults to true.

  • parse_kwargs – Keyword arguments to pass through to pystow.utils.read_rdf() and transitively to rdflib.Graph.parse().

Returns:

An RDF graph
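
The name parameter matters because the local file name is otherwise taken from the end of the URL. The following is a minimal sketch of that idea using only the standard library. The helper filename_from_url is hypothetical and written for illustration; it is not pystow's own implementation, and the example.org URL is made up.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def filename_from_url(url: str) -> str:
    """Return the last path segment of a URL.

    Hypothetical helper for illustration only, not part of pystow's API.
    """
    # urlparse separates the path from the query string and fragment,
    # so "?id=123" does not end up in the file name.
    return PurePosixPath(urlparse(url).path).name


# A URL with a proper file name keeps its extensions.
print(filename_from_url("https://ftp.expasy.org/databases/rhea/rdf/rhea.rdf.gz"))

# A URL without one yields a bare segment with no extension,
# which is the case where passing name= explicitly is useful.
print(filename_from_url("https://example.org/download?id=123"))
```

In the second case the derived name carries no extension, so supplying name explicitly gives the cached file a meaningful name.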

Example usage

import pystow
import rdflib

url = "https://ftp.expasy.org/databases/rhea/rdf/rhea.rdf.gz"
rdf_graph: rdflib.Graph = pystow.ensure_rdf("rhea", url=url, parse_kwargs={"format": "xml"})

Note

When rdflib is able to guess the format, you can omit the “format” entry from the parse_kwargs argument.

Here’s another example

import pystow
import rdflib

url = "http://oaei.webdatacommons.org/tdrs/testdata/persistent/knowledgegraph/v3/suite/memoryalpha-stexpanded/component/reference.xml"
rdf_graph: rdflib.Graph = pystow.ensure_rdf(
    "memoryalpha-stexpanded", url=url, parse_kwargs={"format": "xml"}
)
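
The precache option described above can be pictured as a pickle stored next to the downloaded file. The sketch below shows the general pattern with the standard library only. The function load_with_pickle_cache, its force argument, and the ".pickle" suffix are assumptions made for illustration; pystow's actual cache location and logic may differ.

```python
import pickle
from pathlib import Path
from typing import Any, Callable


def load_with_pickle_cache(
    path: Path,
    parse: Callable[[Path], Any],
    force: bool = False,
) -> Any:
    """Parse a file once and reuse a pickled copy on later calls.

    Hypothetical helper for illustration only, not pystow's implementation.
    """
    # Assumed naming scheme: the cache sits beside the source file.
    cache_path = path.with_name(path.name + ".pickle")

    # Fast path: reuse the cached object unless a refresh is forced.
    if cache_path.is_file() and not force:
        with cache_path.open("rb") as file:
            return pickle.load(file)

    # Slow path: run the (potentially expensive) parser, then store the result.
    result = parse(path)
    with cache_path.open("wb") as file:
        pickle.dump(result, file, protocol=pickle.HIGHEST_PROTOCOL)
    return result
```

With rdflib installed, the parse argument could be a small function that builds an rdflib.Graph and calls its parse() method on the path. Parsing a large RDF/XML file is typically much slower than unpickling the resulting graph, which is the motivation for caching.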