No code now relies on using URL's valid state.
A URL can still be _technically_ invalid through use of the URL
constructor or by directly changing URL fields.
However, all URLs should be constructed through the URL parser,
and we should ideally be getting rid of the default constructor
at some stage.
Also, any code which is manually setting URL fields need to be
aware that this is full of pitfalls since there are many different
forms of canonicalization which is bypassed by not going through
the URL parser.
There's a bit of a UTF-8 assumption with this change. But nearly every
caller of these methods were immediately creating a String from the
resulting ByteString anyways.
This has no functional difference as far as I can tell, but for
clarity explicitly do not attempt to do this, which has the nice
side effect of not checking for whitespace known to not exist.
Corresponds to URL spec change:
https://github.com/whatwg/url/commit/cc8b776b
Note that the new test failure being introduced here is an unrelated
WPT test change bundled in the resources test file update that I am
not convinced is correct.
URL::basic_parse has a subtle bug where the resulting URL is not set
to valid when StateOveride is provided and the URL parser early returns
a valid URL.
This has not surfaced as a problem so far, as the only users of the
state override API provide an already valid URL buffer and also ignore
the result of basic parsing with a state override.
However, this bug surfaces implementing the URL pattern spec, which as
part of URL canonicalization:
* Provides a dummy URL record
* Basic URL parses that URL with state override
* Checks the result of the URL parser to validate the URL
While we could set URL validity on every early return of the URL parser
during state override, it has been a long standing FIXME around the code
to try and remove the awkward validity state of the URL class. So this
commit makes the first stage of this change by migrating the basic
parser API to return Optional, which also happens to make this subtle
issue not a problem any more.
C++ will jovially select the implicit conversion operator, even if it's
complete bogus, such as for unknown-size types or non-destructible
types. Therefore, all such conversions (which incur a copy) must
(unfortunately) be explicit so that non-copyable types continue to work.
NOTE: We make an exception for trivially copyable types, since they
are, well, trivially copyable.
Co-authored-by: kleines Filmröllchen <filmroellchen@serenityos.org>
This lets us move a few Host-related functions (like serialization and
checks for what the Host is) into Host instead of having them dotted
around the codebase.
For now, the interface is still very Variant-like, to avoid having to
change quite so much in one go.
A couple of reasons:
- Origin's Host (when in the tuple state) can't be null
- There's an "empty host" concept in the spec which is NOT the same as a
null Host, and that was confusing me.