urllib.request.url2pathname() mishandles empty authority sections (mostly) #126766

Closed
@barneygale

Bug report

Bug description:

File URIs that start with 3+ slashes should be parsed as having an empty authority section (ref), but urllib.request.url2pathname() incorrectly retains the slashes introducing the authority section. This means it can't properly parse the most common form of POSIX absolute file URIs (e.g. file:///etc/hosts).

On Windows, url2pathname() correctly discards slashes before DOS drives (so file:///c:/foo is parsed as c:\foo), and before old-fashioned UNC URIs (so file:////server/share is parsed as \\server\share), but incorrectly retains slashes if a rooted, driveless path is decoded (so file:///foo/bar is decoded as \\\foo\bar instead of \foo\bar). This is much less of a problem because such paths are rare on Windows.
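The Windows results expected above can be sketched with a small standalone function. This is only an illustration of the desired mapping, not CPython's implementation or the eventual fix; the name `windows_url2pathname_sketch` is hypothetical, and it deliberately ignores other forms that the real function may handle.

```python
import string
from urllib.parse import unquote


def windows_url2pathname_sketch(url):
    """Hypothetical helper: map the part of a file URI after 'file:' to a
    Windows-style path, following the expectations listed in this report."""
    if url.startswith('///'):
        # '//' introduces the authority; a third '/' means the authority is
        # empty, so drop the two introducing slashes and keep the path.
        url = url[2:]
    # A rooted DOS drive such as '/c:/foo' loses its leading slash.
    if len(url) >= 3 and url[0] == '/' and url[1] in string.ascii_letters and url[2] == ':':
        url = url[1:]
    # Percent-decode, then switch to backslash separators.
    return unquote(url).replace('/', '\\')


print(windows_url2pathname_sketch('///c:/foo'))         # DOS drive
print(windows_url2pathname_sketch('////server/share'))  # old-fashioned UNC form
print(windows_url2pathname_sketch('///foo/bar'))        # rooted, driveless path
```

The sketch is pure string manipulation, so it can be run on any platform to check the three cases.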

>>> from urllib.request import url2pathname
>>> url2pathname('///etc/hosts')
'///etc/hosts'  # expected: '/etc/hosts'
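For comparison, here is a minimal sketch of the expected POSIX behaviour. It is an assumption about what a correct result looks like, not the actual CPython code; `posix_url2pathname_sketch` is a hypothetical name introduced for illustration.

```python
from urllib.parse import unquote


def posix_url2pathname_sketch(url):
    """Hypothetical helper: decode the part of a file URI after 'file:'
    into a POSIX path, discarding an empty authority section."""
    if url.startswith('///'):
        # '//' introduces the authority; a third '/' means the authority is
        # empty, so drop the two introducing slashes and keep the path root.
        url = url[2:]
    # Percent-decode the remaining path.
    return unquote(url)


print(posix_url2pathname_sketch('///etc/hosts'))
print(posix_url2pathname_sketch('/etc/hosts'))
```

Paths with no authority section at all (a single leading slash) pass through unchanged apart from percent-decoding.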

CPython versions tested on:

CPython main branch

Operating systems tested on:

Linux, Windows

Labels

3.12 (only security fixes), 3.13 (bugs and security fixes), 3.14 (bugs and security fixes), stdlib (Python modules in the Lib dir), type-bug (An unexpected behavior, bug, or error)