apeye.url

pathlib-like approach to URLs.

Changed in version 1.0.0: SlumberURL and RequestsURL moved to apeye.slumber_url and apeye.requests_url respectively.

Classes:

Domain(​subdomain, domain, suffix)

typing.NamedTuple of a URL’s subdomain, domain, and suffix.

URL(​[url])

pathlib-like class for URLs.

URLPath(​*args)

Represents the path part of a URL.

Data:

URLPathType

Invariant TypeVar bound to apeye.url.URLPath.

URLType

Invariant TypeVar bound to apeye.url.URL.

URLType = TypeVar('URLType', bound=URL)

Type:    TypeVar

Invariant TypeVar bound to apeye.url.URL.

URLPathType = TypeVar('URLPathType', bound=URLPath)

Type:    TypeVar

Invariant TypeVar bound to apeye.url.URLPath.

New in version 1.1.0.
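As a rough illustration of what a bound TypeVar buys, the sketch below annotates a function so that it returns the same subclass it was given. It does not use apeye; `Resource`, `SubResource`, `ResourceType` and `clone` are hypothetical names invented for this example.

```python
from typing import TypeVar


class Resource:
    """Hypothetical base class standing in for a class such as URL."""

    def __init__(self, value: str = "") -> None:
        self.value = value


class SubResource(Resource):
    """Hypothetical subclass."""


# Bound TypeVar: accepts Resource or any subclass of it.
ResourceType = TypeVar("ResourceType", bound=Resource)


def clone(obj: ResourceType) -> ResourceType:
    # type(obj) keeps the subclass, so the declared return type is accurate.
    return type(obj)(obj.value)


copy = clone(SubResource("x"))
```

A type checker infers `copy` as `SubResource` rather than `Resource`, which is the effect a return type of ~URLType has on the methods listed on this page.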

class URL(url='')[source]

Bases: PathLike

pathlib-like class for URLs.

Parameters

url (Union[str, URL]) – The URL to construct the URL object from. Default ''.

Changed in version 0.3.0: The url parameter can now be a string or a URL.

Changed in version 1.1.0: Added support for sorting and rich comparisons (<, <=, > and >=).

Methods:

__eq__(​other)

Return self == other.

__fspath__(​)

Returns the file system path representation of the URL.

__ge__(​other)

Return self>=value.

__gt__(​other)

Return self>value.

__le__(​other)

Return self<=value.

__lt__(​other)

Return self<value.

__repr__(​)

Returns the string representation of the URL.

__str__(​)

Returns the URL as a string.

__truediv__(​key)

Construct a new URL object for the given child of this URL.

from_parts(​scheme, netloc, path[, query, …])

Construct a URL from a scheme, netloc and path.

joinurl(​*args)

Construct a new URL object by combining the given arguments with this instance’s path part.

relative_to(​other)

Returns a version of this URL’s path relative to other.

strict_compare(​other)

Return whether self and other are equal, comparing the scheme, netloc, path, fragment and query parameters.

with_name(​name[, inherit])

Return a new URL with the file name changed.

with_suffix(​suffix[, inherit])

Returns a new URL with the file suffix changed.

Attributes:

base_url

Returns an apeye.url.URL object representing the URL without query strings or URL fragments.

domain

Returns an apeye.url.Domain object representing the domain part of the URL.

fqdn

Returns the Fully Qualified Domain Name of the URL.

fragment

The URL fragment, used to identify a part of the document.

name

The final path component, if any.

netloc

Network location part of the URL

parent

The logical parent of the URL.

parents

An immutable sequence providing access to the logical ancestors of the URL.

parts

An object providing sequence-like access to the components in the URL.

path

The hierarchical path of the URL

port

The port number of the URL as an integer, if present.

query

The query parameters of the URL, if present.

scheme

URL scheme specifier

stem

The final path component, minus its last suffix.

suffix

The final component’s last suffix, if any.

suffixes

A list of the final component’s suffixes, if any.

__eq__(other)[source]

Return self == other.

Attention

URL fragments and query parameters are not compared.

See also

URL.strict_compare(), which does consider those attributes.

Return type

bool
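The difference between the two comparisons can be modelled with the standard library. This is only a sketch and does not call apeye: `loose_equal` and `strict_equal` are hypothetical helpers, and the strict variant compares the raw query string rather than parsed parameters.

```python
from urllib.parse import urlsplit


def loose_equal(a: str, b: str) -> bool:
    """Compare scheme, netloc and path only (hypothetical helper)."""
    x, y = urlsplit(a), urlsplit(b)
    return (x.scheme, x.netloc, x.path) == (y.scheme, y.netloc, y.path)


def strict_equal(a: str, b: str) -> bool:
    """Also compare the query string and fragment (hypothetical helper)."""
    return urlsplit(a) == urlsplit(b)


first = "https://example.com/docs?page=1#intro"
second = "https://example.com/docs?page=2"

same_loosely = loose_equal(first, second)    # True: query and fragment ignored
same_strictly = strict_equal(first, second)  # False: query and fragment differ
```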

__fspath__()[source]

Returns the file system path representation of the URL.

This consists of the netloc and path attributes.

Return type

str
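One way to picture that value is to concatenate the two components as parsed by the standard library. This sketch assumes plain concatenation and does not call apeye; the example URL is arbitrary.

```python
from urllib.parse import urlsplit

parts = urlsplit("https://bbc.co.uk/news/sport")

# Assumption: netloc followed directly by the path, with no scheme.
fs_like = parts.netloc + parts.path  # 'bbc.co.uk/news/sport'
```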

__repr__()[source]

Returns the string representation of the URL.

Return type

str

__str__()[source]

Returns the URL as a string.

Return type

str

__truediv__(key)[source]

Construct a new URL object for the given child of this URL.

Return type

~URLType

Changed in version 0.7.0:
  • Added support for division by integers.

  • Now officially supports the new path having a URL fragment and/or query parameters. Any URL fragment or query parameters from the parent URL are not inherited by its children.

property base_url

Returns an apeye.url.URL object representing the URL without query strings or URL fragments.

New in version 0.7.0.

Return type

~URLType

property domain

Returns an apeye.url.Domain object representing the domain part of the URL.

Return type

Domain

property fqdn

Returns the Fully Qualified Domain Name of the URL.

Return type

str

fragment

Type:    Optional[str]

The URL fragment, used to identify a part of the document. None if absent from the URL.

New in version 0.7.0.

classmethod from_parts(scheme, netloc, path, query=None, fragment=None)[source]

Construct a URL from a scheme, netloc and path.

Parameters
  • scheme (str) – The scheme of the URL, e.g. 'http'.

  • netloc (str) – The netloc of the URL, e.g. 'bbc.co.uk:80'.

  • path (Union[str, Path, PathLike]) – The path of the URL, e.g. '/news'.

  • query (Optional[Mapping[Any, List]]) – The query parameters of the URL, if present. Default None.

  • fragment (Optional[str]) – The URL fragment, used to identify a part of the document. None if absent from the URL. Default None.

Put together, the resulting URL would be 'http://bbc.co.uk:80/news'.

Return type

~URLType

Changed in version 0.7.0: Added the query and fragment arguments.
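The way these parts combine can be sketched with `urllib.parse`. This is a standard-library analogue, not the apeye implementation; `build_url` is a hypothetical helper that mirrors the parameter list above.

```python
from typing import Any, List, Mapping, Optional
from urllib.parse import urlencode, urlunsplit


def build_url(
    scheme: str,
    netloc: str,
    path: str,
    query: Optional[Mapping[Any, List]] = None,
    fragment: Optional[str] = None,
) -> str:
    """Assemble a URL string from its parts (hypothetical helper)."""
    # doseq=True expands each list of values into repeated key=value pairs.
    query_string = urlencode(query, doseq=True) if query else ""
    return urlunsplit((scheme, netloc, path, query_string, fragment or ""))


plain = build_url("http", "bbc.co.uk:80", "/news")
# 'http://bbc.co.uk:80/news'
detailed = build_url("http", "bbc.co.uk:80", "/news", {"page": ["2"]}, "top")
# 'http://bbc.co.uk:80/news?page=2#top'
```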

joinurl(*args)[source]

Construct a new URL object by combining the given arguments with this instance’s path part.

New in version 1.1.0.

Except for the final path element, any queries and fragments are ignored.

Return type

~URLType

Returns

A new URL representing either a subpath (if all arguments are relative paths) or a totally different path (if one of the arguments is absolute).

property name

The final path component, if any.

Return type

str

netloc

Type:    str

Network location part of the URL

property parent

The logical parent of the URL.

Return type

~URLType

property parents

An immutable sequence providing access to the logical ancestors of the URL.

Return type

Tuple[~URLType, …]

property parts

An object providing sequence-like access to the components in the URL.

To retrieve only the parts of the path, use URL.path.parts.

Return type

Tuple[str, …]

path

Type:    URLPath

The hierarchical path of the URL

property port

The port number of the URL as an integer, if present. Default None.

New in version 0.7.0.

Return type

Optional[int]
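The same "integer if present, otherwise None" convention is used by the standard library's URL parser, which makes for a quick illustration. The sketch below uses `urllib.parse` rather than apeye.

```python
from urllib.parse import urlsplit

with_port = urlsplit("https://localhost:8080/status").port  # 8080
without_port = urlsplit("https://bbc.co.uk/news").port      # None
```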

query

Type:    Dict[str, List[str]]

The query parameters of the URL, if present.

New in version 0.7.0.
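A mapping of this shape, where every key holds a list of values, is what `urllib.parse.parse_qs` produces. The sketch below shows that shape with the standard library; it is not taken from apeye, and the example URL is made up.

```python
from urllib.parse import parse_qs, urlsplit

raw_query = urlsplit("https://example.com/search?tag=python&tag=urls&page=2").query

# Repeated keys are collected into one list; single values are still lists.
params = parse_qs(raw_query)
# {'tag': ['python', 'urls'], 'page': ['2']}
```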

relative_to(other)[source]

Returns a version of this URL’s path relative to other.

New in version 1.1.0.

Parameters

other (Union[str, URL, URLPath]) – Either a URL, or a string or URLPath representing an absolute path. If a URL, the netloc must match this URL’s.

Raises

ValueError – if the operation is not possible (i.e. because this URL’s path is not a subpath of the other path)

Return type

URLPath

scheme

Type:    str

URL scheme specifier

property stem

The final path component, minus its last suffix.

strict_compare(other)[source]

Return whether self and other are equal, comparing the scheme, netloc, path, fragment and query parameters.

New in version 0.7.0.

Return type

bool

property suffix

The final component’s last suffix, if any.

This includes the leading period. For example: '.txt'.

Return type

str

property suffixes

A list of the final component’s suffixes, if any.

These include the leading periods. For example: ['.tar', '.gz'].

Return type

List[str]
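The name, stem, suffix and suffixes attributes follow pathlib's naming. As an illustration, the sketch below shows the corresponding values on a `pathlib.PurePosixPath`, the class URLPath is based on; it does not call apeye, and the path is an arbitrary example.

```python
from pathlib import PurePosixPath

path = PurePosixPath("/downloads/archive.tar.gz")

name = path.name          # 'archive.tar.gz'
stem = path.stem          # 'archive.tar' (only the last suffix is dropped)
suffix = path.suffix      # '.gz'
suffixes = path.suffixes  # ['.tar', '.gz']
```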

with_name(name, inherit=True)[source]

Return a new URL with the file name changed.

Parameters
  • name (str)

  • inherit (bool) – Whether the new URL should inherit the query string and fragment from this URL. Default True.

Return type

~URLType

Changed in version 0.7.0: Added the inherit parameter.

with_suffix(suffix, inherit=True)[source]

Returns a new URL with the file suffix changed.

If the URL has no suffix, add the given suffix.

If the given suffix is an empty string, remove the suffix from the URL.

Parameters
  • suffix (str)

  • inherit (bool) – Whether the new URL should inherit the query string and fragment from this URL. Default True.

Return type

~URLType

Changed in version 0.7.0: Added the inherit parameter.
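The effect of the inherit flag can be sketched with the standard library. This is an analogue under stated assumptions, not the apeye implementation: `swap_suffix` is a hypothetical helper that changes the suffix of the path and then either keeps or drops the query string and fragment.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit, urlunsplit


def swap_suffix(url: str, suffix: str, inherit: bool = True) -> str:
    """Change the file suffix of a URL string (hypothetical helper)."""
    parts = urlsplit(url)
    new_path = str(PurePosixPath(parts.path).with_suffix(suffix))
    # Keep the query string and fragment only when inherit is true.
    query = parts.query if inherit else ""
    fragment = parts.fragment if inherit else ""
    return urlunsplit((parts.scheme, parts.netloc, new_path, query, fragment))


source = "https://example.com/report.txt?v=2#top"

kept = swap_suffix(source, ".pdf")                    # query and fragment kept
dropped = swap_suffix(source, ".pdf", inherit=False)  # query and fragment dropped
added = swap_suffix("https://example.com/report", ".txt")   # no suffix: one is added
removed = swap_suffix("https://example.com/report.txt", "")  # empty suffix: removed
```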

class URLPath(*args)[source]

Bases: PurePosixPath

Represents the path part of a URL.

Subclass of pathlib.PurePosixPath that provides a subset of its methods.

Changed in version 1.1.0: Implemented is_absolute(), joinpath(), relative_to(), match(), anchor, drive, and support for rich comparisons (<, <=, > and >=), which previously raised NotImplementedError.

Methods:

__bytes__(​)

Return the bytes representation of the path.

__eq__(​other)

Return self == other.

__repr__(​)

Return a string representation of the URLPath.

__rtruediv__(​key)

Return value / self.

__str__(​)

Return the string representation of the path, suitable for passing to system calls.

__truediv__(​key)

Return self / value.

is_absolute(​)

Returns whether the path is absolute (i.e. starts with /).

is_reserved(​)

Return True if the path contains one of the special names reserved by the system, if any.

joinpath(​*args)

Combine this URLPath with one or several arguments.

relative_to(​*other)

Returns the relative path to another path identified by the passed arguments.

with_name(​name)

Return a new path with the file name changed.

with_suffix(​suffix)

Return a new path with the file suffix changed.

Attributes:

name

The final path component, if any.

parent

The logical parent of the path.

parents

A sequence of this path’s logical parents.

parts

An object providing sequence-like access to the components in the filesystem path.

root

The root of the path, if any.

stem

The final path component, minus its last suffix.

suffix

The final component’s last suffix, if any.

suffixes

A list of the final component’s suffixes, if any.

__bytes__()

Return the bytes representation of the path. Using this is only recommended under Unix.

__eq__(other)

Return self == other.

Return type

bool

__repr__()[source]

Return a string representation of the URLPath.

Return type

str

__rtruediv__(key)

Return value / self.

__str__()[source]

Return the string representation of the path, suitable for passing to system calls.

Return type

str

__truediv__(key)

Return self / value.

is_absolute()[source]

Returns whether the path is absolute (i.e. starts with /).

New in version 1.1.0: previously raised NotImplementedError.

Return type

bool

is_reserved()

Return True if the path contains one of the special names reserved by the system, if any.

joinpath(*args)[source]

Combine this URLPath with one or several arguments.

New in version 1.1.0: previously raised NotImplementedError.

Return type

~URLPathType

Returns

A new URLPath representing either a subpath (if all arguments are relative paths) or a totally different path (if one of the arguments is absolute).
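The relative versus absolute distinction is the one `pathlib.PurePosixPath.joinpath` makes, so it can be illustrated with the parent class directly. The sketch below uses only the standard library and arbitrary example paths.

```python
from pathlib import PurePosixPath

base = PurePosixPath("/news")

# Relative arguments are appended, giving a subpath of the base.
subpath = base.joinpath("sport", "football")

# An absolute argument discards everything before it.
replaced = base.joinpath("sport", "/weather")
```

Here `str(subpath)` is `'/news/sport/football'` and `str(replaced)` is `'/weather'`.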

property name

The final path component, if any.

property parent

The logical parent of the path.

property parents

A sequence of this path’s logical parents.

property parts

An object providing sequence-like access to the components in the filesystem path.

relative_to(*other)[source]

Returns the relative path to another path identified by the passed arguments.

The arguments are joined together to form a single path, and therefore the following behave identically:

>>> URLPath("/news/sport").relative_to("/", "news")
URLPath('sport')
>>> URLPath("/news/sport").relative_to("/news")
URLPath('sport')

New in version 1.1.0: previously raised NotImplementedError.

Parameters

*other

Raises

ValueError – if the operation is not possible (because this is not a subpath of the other path)

See also

URL.relative_to(), which is recommended when constructing a relative path from a URL. This method cannot correctly handle some cases, such as:

>>> URL("https://github.com/domdfcoding").path.relative_to(URL("https://github.com").path)
Traceback (most recent call last):
ValueError: '/domdfcoding' does not start with ''

Since URL("https://github.com").path is URLPath('').

Instead, use:

>>> URL("https://github.com/domdfcoding").relative_to(URL("https://github.com"))
URLPath('domdfcoding')

Return type

~URLPathType

property root

The root of the path, if any.

property stem

The final path component, minus its last suffix.

property suffix

The final component’s last suffix, if any.

This includes the leading period. For example: '.txt'.

property suffixes

A list of the final component’s suffixes, if any.

These include the leading periods. For example: ['.tar', '.gz'].

with_name(name)

Return a new path with the file name changed.

with_suffix(suffix)

Return a new path with the file suffix changed. If the path has no suffix, add the given suffix. If the given suffix is an empty string, remove the suffix from the path.

namedtuple Domain(subdomain, domain, suffix)[source]

typing.NamedTuple of a URL’s subdomain, domain, and suffix.

Fields
  1.  subdomain (str) – Alias for field number 0

  2.  domain (str) – Alias for field number 1

  3.  suffix (str) – Alias for field number 2

__repr__()[source]

Return a string representation of the Domain.

Return type

str

property fqdn

Returns a Fully Qualified Domain Name, if there is a proper domain/suffix.

>>> URL('https://forums.bbc.co.uk/path/to/file').domain.fqdn
'forums.bbc.co.uk'
>>> URL('https://localhost:8080').domain.fqdn
''

property ipv4

Returns the IPv4 address, if that is what the presented domain/URL is.

>>> URL('https://127.0.0.1/path/to/file').domain.ipv4
IPv4Address('127.0.0.1')
>>> URL('https://127.0.0.1.1/path/to/file').domain.ipv4
>>> URL('https://256.1.1.1').domain.ipv4

Return type

Optional[IPv4Address]
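The "address or None" behaviour in the examples above can be reproduced with the standard library's `ipaddress` module. This is a minimal sketch, not the apeye implementation; `parse_ipv4` is a hypothetical helper.

```python
from ipaddress import IPv4Address
from typing import Optional


def parse_ipv4(host: str) -> Optional[IPv4Address]:
    """Return the host as an IPv4Address, or None if it is not one."""
    try:
        return IPv4Address(host)
    except ValueError:
        # Raised for too many octets, octets above 255, hostnames, and so on.
        return None


valid = parse_ipv4("127.0.0.1")          # IPv4Address('127.0.0.1')
too_many = parse_ipv4("127.0.0.1.1")     # None
out_of_range = parse_ipv4("256.1.1.1")   # None
```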

property registered_domain

Joins the domain and suffix fields with a dot, if they’re both set.

>>> URL('https://forums.bbc.co.uk').domain.registered_domain
'bbc.co.uk'
>>> URL('https://localhost:8080').domain.registered_domain
''
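To show how a named tuple can carry derived properties such as these, here is a minimal sketch. It is not the apeye class: `DomainSketch` is a hypothetical stand-in, and the rule that both domain and suffix must be set before anything is returned is an assumption based on the descriptions and examples above.

```python
from typing import NamedTuple


class DomainSketch(NamedTuple):
    """Hypothetical stand-in for a (subdomain, domain, suffix) tuple."""

    subdomain: str
    domain: str
    suffix: str

    @property
    def registered_domain(self) -> str:
        # Join domain and suffix only when both are set.
        if self.domain and self.suffix:
            return f"{self.domain}.{self.suffix}"
        return ""

    @property
    def fqdn(self) -> str:
        # Assumption: an FQDN needs a proper domain and suffix.
        if self.domain and self.suffix:
            return ".".join(part for part in self if part)
        return ""


forums = DomainSketch("forums", "bbc", "co.uk")
local = DomainSketch("", "localhost", "")
```

With these values, `forums.fqdn` is `'forums.bbc.co.uk'`, `forums.registered_domain` is `'bbc.co.uk'`, and both properties are empty strings for `local`.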