Commit graph

17 commits

Author SHA1 Message Date
Shannon Booth
8e37cd2f71 LibURL: Remove URL's valid state
No code now relies on using URL's valid state.

A URL can still be _technically_ invalid through use of the URL
constructor or by directly changing URL fields.

However, all URLs should be constructed through the URL parser,
and we should ideally be getting rid of the default constructor
at some stage.

Also, any code which is manually setting URL fields needs to be
aware that this is full of pitfalls, since there are many different
forms of canonicalization which are bypassed by not going through
the URL parser.
2025-04-19 07:18:43 -04:00
Timothy Flynn
0a256b0a9a AK+Everywhere: Change StringView case conversions to return String
There's a bit of a UTF-8 assumption with this change. But nearly every
caller of these methods was immediately creating a String from the
resulting ByteString anyway.
2025-04-07 17:44:38 +02:00
Shannon Booth
4b6f0ee24a LibURL: Do not trim whitespace parsing port in URL parser
This has no functional difference as far as I can tell, but for
clarity explicitly do not attempt to do this, which has the nice
side effect of not checking for whitespace known to not exist.
2025-04-07 10:29:09 -04:00
Timothy Flynn
ee6b2db009 AK+LibURL+LibWeb: Use simdutf to validate ASCII strings
simdutf provides a vectorized ASCII validator, so let's use that instead
of looping over strings manually.
2025-04-06 11:05:58 -04:00
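
For context on ee6b2db009, the kind of manual loop that a vectorized validator replaces can be sketched as below. This is a minimal illustration, not the code that was removed: `is_ascii_scalar` and `demo_is_ascii_scalar` are hypothetical names. To my knowledge simdutf exposes `simdutf::validate_ascii(const char*, size_t)` for the same check, but it is not called here so that the sketch stays self-contained.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical scalar ASCII check: every byte must be below 0x80.
// A vectorized validator performs the same test many bytes at a time.
bool is_ascii_scalar(std::string_view text)
{
    for (unsigned char byte : text) {
        if (byte >= 0x80)
            return false;
    }
    return true;
}

void demo_is_ascii_scalar()
{
    assert(is_ascii_scalar("example.com"));
    // "h" followed by the two-byte UTF-8 encoding of U+00E9.
    assert(!is_ascii_scalar("h\xC3\xA9"));
}
```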
Shannon Booth
a9777a3300 LibURL: Make port state override return failure more for URLPattern
Corresponds to URL spec change:

https://github.com/whatwg/url/commit/cc8b776b

Note that the new test failure being introduced here is an unrelated
WPT test change bundled in the resources test file update that I am
not convinced is correct.
2025-04-06 08:24:54 -04:00
Shannon Booth
4e8f2e48c4 LibURL: Report all hostname state failures for URLPattern
Corresponds to URL spec change:

https://github.com/whatwg/url/commit/c23aec1
2025-04-06 08:24:54 -04:00
Shannon Booth
3f73cd30a2 LibURL: Rename 'cannot have a base URL' to 'has an opaque path'
This follows a rename made in the URL specification.
2025-04-06 08:24:54 -04:00
Shannon Booth
ec3c545426 LibURL+LibWeb: Ensure opaque paths always roundtrip
Corresponds to: https://github.com/whatwg/url/commit/6c782003
2025-03-18 12:17:19 +00:00
Shannon Booth
f775ee8a93 LibURL: Rename 'cannot be a base URL' state to 'opaque path' state
This follows a rename made in the URL specification.
2025-03-15 07:39:03 -04:00
zoupingshi
b609d8481a LibURL+LibWeb+Tests: Remove redundant words
2025-02-27 10:35:39 +00:00
Shannon Booth
5bed8f4055 LibURL+LibWeb: Make URL::basic_parse return an Optional<URL>
URL::basic_parse has a subtle bug where the resulting URL is not set
to valid when a state override is provided and the URL parser early
returns a valid URL.

This has not surfaced as a problem so far, as the only users of the
state override API provide an already valid URL buffer and also ignore
the result of basic parsing with a state override.

However, this bug surfaces implementing the URL pattern spec, which as
part of URL canonicalization:
 * Provides a dummy URL record
 * Basic URL parses that URL with state override
 * Checks the result of the URL parser to validate the URL

While we could set URL validity on every early return of the URL parser
during state override, it has been a long standing FIXME around the code
to try and remove the awkward validity state of the URL class. So this
commit makes the first stage of this change by migrating the basic
parser API to return Optional, which also happens to make this subtle
issue not a problem any more.
2025-01-11 10:08:29 -05:00
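
The design point in 5bed8f4055 (report failure through the return type rather than a validity flag on the object) can be sketched with `std::optional`. This is a toy under stated assumptions: `ToyURL`, `toy_parse` and `demo_toy_parse` are hypothetical names, the "parser" only splits on the first colon, and nothing here reflects LibURL's actual `URL::basic_parse` signature or AK's `Optional`.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical stand-in for a URL record; not LibURL's actual URL class.
struct ToyURL {
    std::string scheme;
    std::string rest;
};

// Failure is the absence of a value, so there is no separate validity flag
// that an early return could forget to set.
std::optional<ToyURL> toy_parse(std::string_view input)
{
    auto colon = input.find(':');
    if (colon == std::string_view::npos || colon == 0)
        return std::nullopt;
    return ToyURL { std::string(input.substr(0, colon)), std::string(input.substr(colon + 1)) };
}

void demo_toy_parse()
{
    auto url = toy_parse("https://example.com/");
    assert(url.has_value());
    assert(url->scheme == "https");
    assert(!toy_parse("no scheme here").has_value());
}
```

With this shape, a caller that validates input by checking the parser's result only has to test whether a value came back.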
Shannon Booth
87c8ae31d3 LibURL: Set IDNA's IgnoreInvalidPunycode to false
See: https://github.com/whatwg/url/commit/a6e449 - which should have no
functional change.
2024-12-05 17:29:49 +01:00
Shannon Booth
5dfb825c5c LibURL: Set IDNA's CheckHyphens to the value of beStrict
See: https://github.com/whatwg/url/commit/cd8f1d
2024-12-05 17:29:49 +01:00
Jonne Ransijn
d7596a0a61 AK: Don't implicitly convert Optional<T&> to Optional<T>
C++ will jovially select the implicit conversion operator, even if it's
completely bogus, such as for unknown-size types or non-destructible
types. Therefore, all such conversions (which incur a copy) must
(unfortunately) be explicit so that non-copyable types continue to work.

NOTE: We make an exception for trivially copyable types, since they
are, well, trivially copyable.

Co-authored-by: kleines Filmröllchen <filmroellchen@serenityos.org>
2024-12-04 01:58:22 +01:00
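
The idea behind d7596a0a61 (make the copying conversion from an optional reference to an optional value explicit) can be illustrated with a small wrapper. This is a sketch only: `ToyOptionalRef` and `demo_explicit_copy` are hypothetical, `std::optional` stands in for AK's `Optional<T>`, and the exception for trivially copyable types mentioned in the commit is not modeled.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical minimal optional-reference wrapper; not AK's Optional<T&>.
template<typename T>
class ToyOptionalRef {
public:
    ToyOptionalRef() = default;
    ToyOptionalRef(T& value)
        : m_ptr(&value)
    {
    }

    bool has_value() const { return m_ptr != nullptr; }
    T& value() const { return *m_ptr; }

    // The conversion copies the referenced object, so it is marked explicit:
    // the copy has to be spelled out at the call site.
    explicit operator std::optional<T>() const
    {
        if (!m_ptr)
            return std::nullopt;
        return *m_ptr;
    }

private:
    T* m_ptr { nullptr };
};

void demo_explicit_copy()
{
    std::string name = "port";
    ToyOptionalRef<std::string> ref(name);

    // Explicit request for a copy.
    auto copy = static_cast<std::optional<std::string>>(ref);
    name = "changed";

    assert(copy.has_value() && *copy == "port"); // the copy is independent
    assert(ref.value() == "changed");            // the reference tracks the original
}
```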
Sam Atkins
63688148b9 LibURL: Promote Host to a proper class
This lets us move a few Host-related functions (like serialization and
checks for what the Host is) into Host instead of having them dotted
around the codebase.

For now, the interface is still very Variant-like, to avoid having to
change quite so much in one go.
2024-11-30 12:07:39 +01:00
Sam Atkins
90e763de4c LibURL: Replace Host's Empty state with making Url's Host optional
A couple of reasons:
- Origin's Host (when in the tuple state) can't be null
- There's an "empty host" concept in the spec which is NOT the same as a
  null Host, and that was confusing me.
2024-11-30 12:07:39 +01:00
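
The distinction drawn in 90e763de4c between a null host and the spec's "empty host" maps naturally onto an optional string. The sketch below is an assumption-laden illustration: `ToyHostURL`, `has_null_host`, `has_empty_host` and `demo_hosts` are hypothetical names, and `std::optional<std::string>` stands in for whatever types LibURL actually uses.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical URL record; not LibURL's actual class.
struct ToyHostURL {
    std::string scheme;
    std::optional<std::string> host; // nullopt = null host, "" = empty host
};

bool has_null_host(ToyHostURL const& url) { return !url.host.has_value(); }
bool has_empty_host(ToyHostURL const& url) { return url.host.has_value() && url.host->empty(); }

void demo_hosts()
{
    ToyHostURL mail { "mailto", std::nullopt };     // e.g. mailto:user@example.com
    ToyHostURL file { "file", std::string {} };     // e.g. file:///tmp/x
    ToyHostURL web { "https", std::string { "example.com" } };

    assert(has_null_host(mail) && !has_empty_host(mail));
    assert(!has_null_host(file) && has_empty_host(file));
    assert(!has_null_host(web) && !has_empty_host(web));
}
```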
Timothy Flynn
93712b24bf Everywhere: Hoist the Libraries folder to the top-level
2024-11-10 12:50:45 +01:00
Renamed from Userland/Libraries/LibURL/Parser.cpp