ensure_soup

ensure_soup(key: str, *subkeys: str, url: str, name: str | None = None, version: VersionHint = None, force: bool = False, download_kwargs: DownloadKwargs | None = None, beautiful_soup_kwargs: Mapping[str, Any] | None = None) → bs4.BeautifulSoup

Ensure a webpage is downloaded and parsed with BeautifulSoup.

Parameters:
  • key – The name of the module, without special characters. The environment variable <KEY>_HOME (the key, uppercased) is checked first; if it is not set, the default home directory is used.

  • subkeys – A sequence of additional strings to join. If none are given, returns the directory for this module.

  • url – The URL to download.

  • name – Overrides the name of the file at the end of the URL, if given. Also useful for URLs that don’t have proper filenames with extensions.

  • force – Whether to download the file again even if the path already exists. Defaults to False.

  • download_kwargs – Keyword arguments to pass through to pystow.utils.download().

  • mode – The read mode, passed to open()

  • open_kwargs – Additional keyword arguments passed to open()

  • beautiful_soup_kwargs – Additional keyword arguments passed to BeautifulSoup
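
The lookup order described for key and subkeys can be sketched with the standard library alone. This is a minimal illustration, not pystow's implementation: the helper name resolve_module_home is hypothetical, and the ~/.data/<key> fallback is an assumption standing in for "the default home directory".

```python
import os
from pathlib import Path


def resolve_module_home(key: str, *subkeys: str) -> Path:
    """Hypothetical helper mirroring the documented lookup order."""
    # The environment variable <KEY>_HOME (key uppercased) is checked first.
    custom_home = os.environ.get(f"{key.upper()}_HOME")
    if custom_home:
        base = Path(custom_home)
    else:
        # Assumed default location; the documentation only says
        # "the default home directory".
        base = Path.home() / ".data" / key
    # Subkeys are joined onto the module directory; with no subkeys,
    # the module directory itself is returned.
    return base.joinpath(*subkeys)
```

With no subkeys, the sketch returns the module directory itself, which matches the behavior described for subkeys.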

Returns:

A BeautifulSoup object

Note

If you don’t need to cache, consider using pystow.utils.get_soup() instead.
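
A minimal usage sketch, based only on the signature shown above. The key "example", the subkey "pages", the URL, and the file name are placeholders chosen for illustration; the call needs pystow and bs4 installed and network access on the first run.

```python
def fetch_page_title(force: bool = False):
    """Download (or reuse a cached copy of) a page and return its <title> tag."""
    # Imported lazily so that merely defining this function needs no third-party packages.
    import pystow

    soup = pystow.ensure_soup(
        "example",                    # key: module name (placeholder)
        "pages",                      # subkey: subdirectory inside the module (placeholder)
        url="https://example.org/",   # placeholder URL with no file name at the end
        name="index.html",            # explicit file name for the cached copy
        force=force,                  # True re-downloads even if the file exists
    )
    # soup is a regular bs4.BeautifulSoup object.
    return soup.title
```

Calling fetch_page_title() a second time should reuse the cached file rather than downloading again, unless force=True is passed.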