We were using the public suffix of the URL's host as its registrable
domain. But the registrable domain is actually the public suffix plus
one additional label.
All URLs are now either constucted through the URL Parser or by
default constructing a URL, and setting each of the fields of that
URL manually. This makes it much more difficult to create invalid
URLs.
Currently we create URLs such as 'about:blank' through the StringView
or ByteString constructor of URL. However, in order to elimate the
use of URL::is_valid, we need to get rid of these constructors as it
makes it way too easy to create an invalid URL.
It is very cumbersome to construct an 'about:blank' URL when using
URL::Parser::basic_parse. So instead of doing that, create some
helper functions which will create the 'about:XXX' URLs with the
correct properties set.
Conveniently, this is also a much faster way of creating these URLs
as it means we do not need to parse the URL and can set all of the
members up front.
Instead of just putting in members directly, wrap them up in structs
which represent what a URL blob entry is meant to hold per the spec.
This makes more obvious what this is meant to represent, such as the
ByteBuffer being used to represent the bytes behind a Blob.
This also allows us to use a stronger type for a function that needs
to return a Blob URL entry's object.
URL::basic_parse has a subtle bug where the resulting URL is not set
to valid when StateOveride is provided and the URL parser early returns
a valid URL.
This has not surfaced as a problem so far, as the only users of the
state override API provide an already valid URL buffer and also ignore
the result of basic parsing with a state override.
However, this bug surfaces implementing the URL pattern spec, which as
part of URL canonicalization:
* Provides a dummy URL record
* Basic URL parses that URL with state override
* Checks the result of the URL parser to validate the URL
While we could set URL validity on every early return of the URL parser
during state override, it has been a long standing FIXME around the code
to try and remove the awkward validity state of the URL class. So this
commit makes the first stage of this change by migrating the basic
parser API to return Optional, which also happens to make this subtle
issue not a problem any more.
This lets us move a few Host-related functions (like serialization and
checks for what the Host is) into Host instead of having them dotted
around the codebase.
For now, the interface is still very Variant-like, to avoid having to
change quite so much in one go.
A couple of reasons:
- Origin's Host (when in the tuple state) can't be null
- There's an "empty host" concept in the spec which is NOT the same as a
null Host, and that was confusing me.