
fix(ext/web): handle Windows file paths in URL parsing #33097

Open
renezander030 wants to merge 2 commits into denoland:main from renezander030:fix/30363-windows-path-url


@renezander030

Summary

Implements the WHATWG URL spec change (whatwg/url#874) to handle Windows-style file paths in URL parsing.

When new URL("C:\\path\\file.txt") is called, the parser now detects the Windows drive letter pattern (single ASCII alpha + : + \) and converts it to a file:/// URL with normalized forward slashes: file:///C:/path/file.txt.
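A minimal sketch of that detection and conversion, written as a standalone Rust function. The name `windows_path_to_file_url` and its signature are hypothetical and are not taken from the PR's code. The sketch assumes the input is already trimmed and does no percent-encoding of characters such as spaces, `#`, or `?`.

```rust
/// Hypothetical helper (not the PR's actual code): returns a `file:///` URL
/// when `input` starts with a drive letter, ':' and '\', otherwise `None`.
fn windows_path_to_file_url(input: &str) -> Option<String> {
    let bytes = input.as_bytes();
    // Pattern from the description: single ASCII alpha, then ':', then '\'.
    let is_drive_path = bytes.len() >= 3
        && bytes[0].is_ascii_alphabetic()
        && bytes[1] == b':'
        && bytes[2] == b'\\';
    if !is_drive_path {
        return None;
    }
    // Normalize every backslash to a forward slash and prepend the file scheme.
    Some(format!("file:///{}", input.replace('\\', "/")))
}
```

Under these assumptions, `windows_path_to_file_url(r"C:\path\file.txt")` yields `Some("file:///C:/path/file.txt")`, while an ordinary URL such as `https://example.com/` yields `None` and would go through the normal parser untouched.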

Changes

  • ext/web/url.rs: Added maybe_convert_windows_path_to_file_url() preprocessing function that detects Windows drive letter patterns before the input reaches rust-url's parser. Called from parse_url() for both op_url_parse and op_url_parse_with_base code paths.
  • tests/unit/url_test.ts: Added tests covering basic paths, different drive letters, lowercase drives, mixed separators, paths with base URL, and URL.parse().
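One way the preprocessing step could be shared by both code paths is sketched below. The names `preprocess_href` and `parse_url_sketch` are hypothetical, and `parse_url_sketch` is only a stand-in that reports what would be handed to the real parser; it does not call the rust-url crate. Returning a `Cow` is a design assumption of this sketch: inputs that are not Windows paths are borrowed unchanged, so the common case does not allocate.

```rust
use std::borrow::Cow;

/// Hypothetical preprocessing step; name and signature are assumptions.
/// Borrows the input unchanged unless it starts with `<alpha>:\`.
fn preprocess_href(href: &str) -> Cow<'_, str> {
    let b = href.as_bytes();
    if b.len() >= 3 && b[0].is_ascii_alphabetic() && b[1] == b':' && b[2] == b'\\' {
        // Windows drive path: rewrite it as a file URL with forward slashes.
        Cow::Owned(format!("file:///{}", href.replace('\\', "/")))
    } else {
        // Anything else is passed through untouched.
        Cow::Borrowed(href)
    }
}

/// Stand-in for the real parse call: returns the (href, base) pair that
/// would reach the parser. The same preprocessing runs with or without a base.
fn parse_url_sketch(href: &str, base: Option<&str>) -> (String, Option<String>) {
    let href = preprocess_href(href);
    (href.into_owned(), base.map(str::to_owned))
}
```

Because the converted input is an absolute `file:` URL, a spec-compliant parser would not resolve it against the base, which is why the sketch forwards the base unchanged.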

Spec details

The WHATWG URL spec change adds a check in the "scheme start state": when the parser encounters a single ASCII alpha character as a potential scheme, followed by : and \, it recognizes this as a Windows drive path rather than a URL scheme. It then sets scheme to file, host to empty string, and transitions to path state.
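The check described above can be illustrated at the component level rather than as string rewriting. The following is a rough sketch, not the spec algorithm or the PR's code: `Components` and `scheme_start_drive_check` are hypothetical names, and the sketch omits percent-encoding and dot-segment handling.

```rust
/// Minimal illustrative record of the components the parser would set.
#[derive(Debug, PartialEq)]
struct Components {
    scheme: String,
    host: String,
    path: Vec<String>,
}

/// Hypothetical sketch of the drive-letter check. Returns `None` when the
/// input should go through normal scheme parsing instead.
fn scheme_start_drive_check(input: &str) -> Option<Components> {
    let mut chars = input.chars();
    // The would-be scheme buffer must be exactly one ASCII alpha...
    let first = chars.next().filter(|c| c.is_ascii_alphabetic())?;
    // ...followed by ':' and then '\'.
    if chars.next()? != ':' || chars.next()? != '\\' {
        return None;
    }
    // Path state: '\' and '/' both act as segment separators, and the
    // drive letter with its colon becomes the first path segment.
    let rest = chars.as_str();
    let mut path = vec![format!("{first}:")];
    path.extend(rest.split(|c: char| c == '\\' || c == '/').map(str::to_owned));
    Some(Components {
        scheme: "file".into(),
        host: String::new(),
        path,
    })
}
```

For `C:\path\file.txt` this produces scheme `file`, an empty host, and the path segments `C:`, `path`, `file.txt`, which serialize to `file:///C:/path/file.txt`.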

Test plan

  • Unit tests added in tests/unit/url_test.ts
  • WPT tests will pass once web-platform-tests/wpt#53459 lands and the test expectations are updated

Fixes #30363

Implement WHATWG URL spec change (url#874) to detect Windows drive
letter patterns (e.g., C:\path\file.txt) in the URL parser's scheme
start state and automatically convert them to file:/// URLs
(file:///C:/path/file.txt).

The spec adds a check: when parsing encounters a single ASCII alpha
letter as the scheme buffer, followed by ':' and '\', it recognizes
this as a Windows drive path rather than a URL scheme. The parser then
sets the scheme to "file", the host to empty string, and transitions
to path state with backslashes normalized to forward slashes.

This is implemented as a preprocessing step in the Rust parse_url
function, before the input reaches the rust-url crate parser.
@Hajime-san
Contributor

@renezander030
Author

Thanks for the pointer. I'm aware of that comment. The difference here is that this implements the WHATWG URL spec change (url#874), which is being tracked by Chromium, Gecko, and WebKit as well.

The upstream Rust url crate follows the WHATWG spec. Since the spec PR hasn't merged yet, the crate won't add this behavior until it does. So if Deno wants to ship this ahead of the spec landing, the conversion has to live in Deno for now.

The implementation is intentionally minimal and isolated (one function). Easy to remove once the upstream crate picks it up.

Happy to hear from the maintainers on whether they'd prefer to wait for upstream or ship early.

@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

Successfully merging this pull request may close these issues.

Implement Windows file path handling in URL parsing (WHATWG URL #874)
