LibURL+LibWeb: Make URL::basic_parse return an Optional<URL>

URL::basic_parse has a subtle bug: the resulting URL is not marked as
valid when a state override is provided and the URL parser returns early
with an otherwise valid URL.

This has not surfaced as a problem so far, as the only users of the
state override API provide an already valid URL buffer and also ignore
the result of basic parsing with a state override.

However, this bug surfaces when implementing the URL Pattern spec,
which, as part of URL canonicalization:
 * Provides a dummy URL record
 * Basic URL parses that URL with state override
 * Checks the result of the URL parser to validate the URL
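
The failure mode in the steps above can be sketched in a few lines. This
is only an illustration under stated assumptions: ToyURL, ToyState and
toy_basic_parse are hypothetical stand-ins, not the real LibURL types or
parser, and the parsing logic is reduced to a single assignment.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical, heavily simplified stand-ins; not the real LibURL types.
struct ToyURL {
    std::string scheme;
    bool valid { false };
};

enum class ToyState { SchemeStart };

// Mimics the old API shape: success is signalled only by the `valid` flag.
ToyURL toy_basic_parse(std::string_view input, ToyURL url = {},
    std::optional<ToyState> state_override = {})
{
    if (input.empty())
        return {}; // failure: the flag stays false
    url.scheme = std::string(input);
    if (state_override.has_value())
        return url; // early return: parsing succeeded, but `valid` is never set
    url.valid = true;
    return url;
}

// A caller that starts from a dummy record and then checks the result.
void toy_demo()
{
    ToyURL dummy; // not yet valid, like the dummy URL record above
    auto result = toy_basic_parse("https", dummy, ToyState::SchemeStart);
    assert(result.scheme == "https"); // the parse did its work...
    assert(!result.valid);            // ...yet the result looks like a failure
}
```

A caller that passes in an already valid URL and ignores the return value
never notices the problem, which matches the existing users described
earlier.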

While we could set URL validity on every early return of the URL parser
during state override, there has been a long-standing FIXME in the code
to remove the awkward validity state of the URL class. So this commit
takes the first step toward that change by migrating the basic parser
API to return Optional, which also happens to eliminate this subtle
issue.
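
As a rough sketch of why returning an optional sidesteps the problem,
here is the same idea with std::optional standing in for AK's Optional.
SketchURL, SketchState and the sketch_* functions are hypothetical names
chosen for illustration and do not reflect the actual parser.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical stand-ins for illustration; not the real LibURL types.
struct SketchURL {
    std::string scheme;
};

enum class SketchState { SchemeStart };

// Failure is an empty optional, so every successful return path,
// including an early return under a state override, is unambiguous.
std::optional<SketchURL> sketch_basic_parse(std::string_view input, SketchURL url = {},
    std::optional<SketchState> state_override = {})
{
    if (input.empty())
        return std::nullopt;
    url.scheme = std::string(input);
    if (state_override.has_value())
        return url; // early return, still clearly a success
    // ... the remaining parser states would run here ...
    return url;
}

// A caller written in the style of the migrated complete_url().
SketchURL sketch_parse_or_default(std::string_view input)
{
    auto result = sketch_basic_parse(input);
    if (!result.has_value())
        return {};
    return *result;
}

void sketch_demo()
{
    assert(sketch_basic_parse("https", {}, SketchState::SchemeStart).has_value());
    assert(!sketch_basic_parse("").has_value());
    assert(sketch_parse_or_default("").scheme.empty());
}
```

With this shape there is no separate flag that a return path can forget
to update; the caller only has to check has_value().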
Shannon Booth 2025-01-10 04:50:34 +13:00 committed by Tim Flynn
commit 5bed8f4055
Notes: github-actions[bot] 2025-01-11 15:09:28 +00:00
9 changed files with 56 additions and 56 deletions

@@ -23,7 +23,7 @@ namespace URL {
 // FIXME: It could make sense to force users of URL to use URL::Parser::basic_parse() explicitly instead of using a constructor.
 URL::URL(StringView string)
-    : URL(Parser::basic_parse(string))
+    : URL(Parser::basic_parse(string).value_or(URL {}))
 {
     if constexpr (URL_PARSER_DEBUG) {
         if (m_data->valid)
@@ -38,7 +38,11 @@ URL URL::complete_url(StringView relative_url) const
     if (!is_valid())
         return {};
-    return Parser::basic_parse(relative_url, *this);
+    auto result = Parser::basic_parse(relative_url, *this);
+    if (!result.has_value())
+        return {};
+    return result.release_value();
 }
 ByteString URL::path_segment_at_index(size_t index) const
@@ -367,12 +371,12 @@ Origin URL::origin() const
         auto path_url = Parser::basic_parse(serialize_path());
         // 3. If pathURL is failure, then return a new opaque origin.
-        if (!path_url.is_valid())
+        if (!path_url.has_value())
             return Origin {};
         // 4. If pathURL's scheme is "http", "https", or "file", then return pathURL's origin.
-        if (path_url.scheme().is_one_of("http"sv, "https"sv, "file"sv))
-            return path_url.origin();
+        if (path_url->scheme().is_one_of("http"sv, "https"sv, "file"sv))
+            return path_url->origin();
         // 5. Return a new opaque origin.
         return Origin {};