ladybird

mirror of https://github.com/LadybirdBrowser/ladybird.git synced 2025-06-04 01:12:56 +00:00

Author	SHA1	Message	Date
Timothy Flynn	0a256b0a9a	AK+Everywhere: Change StringView case conversions to return String There's a bit of a UTF-8 assumption with this change. But nearly every caller of these methods was immediately creating a String from the resulting ByteString anyways.	2025-04-07 17:44:38 +02:00
Shannon Booth	0a58497ab9	LibURL/Pattern: Fix PatternParser logic for prefix codepoint comparison We were not properly handling the case that prefix code point was the empty string (which we represent as an OptionalNone). While this still resulted in the correct pattern string being generated, an incorrect regular expression was being generated causing matching to fail.	2025-04-07 10:29:09 -04:00
Shannon Booth	4b6f0ee24a	LibURL: Do not trim whitespace parsing port in URL parser This has no functional difference as far as I can tell, but for clarity explicitly do not attempt to do this, which has the nice side effect of not checking for whitespace known to not exist.	2025-04-07 10:29:09 -04:00
Shannon Booth	565ccc04a9	LibURL/Pattern: Do not trim whitespace interpreting port It turns out that the problem here was simply that we were trimming trailing whitespace when we did not need to, which meant that the port number '80 ' was being converted to the empty string per URLPattern elision as the port matches the http scheme.	2025-04-07 10:29:09 -04:00
Timothy Flynn	ee6b2db009	AK+LibURL+LibWeb: Use simdutf to validate ASCII strings simdutf provides a vectorized ASCII validator, so let's use that instead of looping over strings manually.	2025-04-06 11:05:58 -04:00
Shannon Booth	212095e1c2	LibURL/Pattern: Ensure string passed through in process a URLPatternInit Corresponds to: `696b402`	2025-04-06 08:24:54 -04:00
Shannon Booth	bee3720b6f	LibURL/Pattern: Make dummyURL from the URL parser with a special scheme Corresponds to: `46c30fda8f` Along with a follow up bug fix that I made of: `5e1c93e2` This, for example, fixes canonicalization of URL hosts containing special characters that should have the unicode ToAscii algorithm performed on them, as the URLs were not being treated as special.	2025-04-06 08:24:54 -04:00
Shannon Booth	83a82a027f	LibURL/Pattern: Do not return errors in some canonicalization steps Corresponds to: `5c979a31`	2025-04-06 08:24:54 -04:00
Shannon Booth	a9777a3300	LibURL: Make port state override return failure more for URLPattern Corresponds to URL spec change: `cc8b776b` Note that the new test failure being introduced here is an unrelated WPT test change bundled in the resources test file update that I am not convinced is correct.	2025-04-06 08:24:54 -04:00
Shannon Booth	4e8f2e48c4	LibURL: Report all hostname state failures for URLPattern Corresponds to URL spec change: `c23aec1`	2025-04-06 08:24:54 -04:00
Shannon Booth	dcb7842f59	LibURL/Pattern: Use opaque pathname serialization in canonicalization The URL spec represents its path as a: Variant<String, Vector<String>> A URL is defined as having an opaque path if it has a single String, the URL path otherwise containing a list of path components. We (like in an older version of the spec) track this through a boolean and only use a Vector with a single component for opaque paths. This means it was incorrect to simply assign the path to a list with a single empty string without setting that URL as opaque, which meant that the path serialization was producing incorrect results. It may make sense to change the API so this situation is a little more clear. But for now, we simply need to set the opaque path boolean to true here.	2025-04-06 08:24:54 -04:00
Shannon Booth	e7ad9a9bad	LibURL/Pattern: URL parse correct value in opaque path canonicalization	2025-04-06 08:24:54 -04:00
Shannon Booth	e54504ad93	LibURL/Pattern: Implement 'compute protocol matches a special scheme'	2025-04-06 08:24:54 -04:00
Shannon Booth	6b1fa3ecd0	LibURL/Pattern: Implement matching a URLPattern	2025-04-06 08:24:54 -04:00
Shannon Booth	2a44420e52	LibURL/Pattern: Implement generating a component match result	2025-04-06 08:24:54 -04:00
Shannon Booth	e35555f00e	LibURL/Pattern: Complete the implementation of the constructor	2025-04-06 08:24:54 -04:00
Shannon Booth	c9e6ad562c	LibURL/Pattern: Implement ability to compile a component This provides the infrastructure for taking a part list from the pattern parser and generating the actual regexp object which is used for matching against URLs from the pattern.	2025-04-06 08:24:54 -04:00
Shannon Booth	934f1ec30d	LibURL/Pattern: Implement the URLPattern Pattern Parser	2025-04-06 08:24:54 -04:00
Shannon Booth	45d852d14b	LibURL: Add helper for getting array of the special schemes This is useful for iterating over all of the special schemes, as needed in the URLPattern implementation.	2025-04-06 08:24:54 -04:00
Shannon Booth	e3ef6d3aee	LibURL/Pattern: Implement ability to generate a pattern string Compiling a URLPattern component will generate a 'parts list' which is used for generating the regular expression that is used for matching against URLs. This parts list is also used to generate (through this function) a pattern string. The pattern string of a URL component is what is exposed on the USVString getters of the URLPattern class itself. As an example, the following: ``` let pattern = new URLPattern({ "pathname": "/foo/(.*)" }); console.log(pattern.pathname); ``` Will log the pattern string of: '/foo/*'.	2025-04-06 08:24:54 -04:00
Shannon Booth	f3679184cb	LibURL/Pattern: Add representation of a URL Pattern 'options' struct These control how a pattern string is generated, which can vary for different components and is also impacted by the 'ignoreCase' option that can be provided in the URLPattern constructor.	2025-04-06 08:24:54 -04:00
Shannon Booth	88bea4a717	LibURL/Pattern: Add a URL Pattern 'Part' representation	2025-04-06 08:24:54 -04:00
Shannon Booth	8a33c57c1e	LibWeb/LibURL: Use an IgnoreCase enum for URLPatternOptions This is to save a future name conflict that will appear between the options IDL dictionary and the options struct that are both present in the spec. It is also a nicer interface for now given there is only a single option at the moment.	2025-04-06 08:24:54 -04:00
Shannon Booth	f80e7d6816	LibURL/Pattern: Implement processing a URL Pattern Init This gets us to the point just before the point of parsing the pattern strings for each URL component to produce a regular expression.	2025-04-06 08:24:54 -04:00
Shannon Booth	3f73cd30a2	LibURL: Rename 'cannot have a base URL' to 'has an opaque path' This follows a rename made in the URL specification.	2025-04-06 08:24:54 -04:00
Shannon Booth	6b85748f53	LibURL/Pattern: Implement helper for escaping a URL Pattern String	2025-04-06 08:24:54 -04:00
Shannon Booth	ec3c545426	LibURL+LibWeb: Ensure opaque paths always roundtrip Corresponds to: `6c782003`	2025-03-18 12:17:19 +00:00
Shannon Booth	a9e20cb6c3	LibURL/Pattern: Use ConstructorStringParser to construct URLPatternInit	2025-03-15 07:39:03 -04:00
Shannon Booth	e369756e9c	LibURL/Pattern: Implement the constructor string parser This is missing one small bit of functionality where the not-yet-implemented component compilation is required.	2025-03-15 07:39:03 -04:00
Shannon Booth	e70272ddef	LibURL/Pattern: Implement URL Pattern canonicalization These are used to normalize URL components.	2025-03-15 07:39:03 -04:00
Shannon Booth	f775ee8a93	LibURL: Rename 'cannot be a base URL' state to 'opaque path' state This follows a rename made in the URL specification.	2025-03-15 07:39:03 -04:00
Shannon Booth	f8f21319f9	LibURL/Pattern: Implement the URL Pattern Tokenizer The tokenizer is used for both pattern string and constructor string parsing of URL Patterns.	2025-03-15 07:39:03 -04:00
Timothy Flynn	a34f7a5bd1	LibURL: Correctly acquire the registrable domain for a URL We were using the public suffix of the URL's host as its registrable domain. But the registrable domain is actually the public suffix plus one additional label.	2025-03-11 12:10:42 +01:00
Vishal Biswas	90b303215e	LibURL: Add U+005E to path percent encoding list Passes wpt tests which were failing after `9bc33c39d4`. It also removes ^ from the Userinfo set as it's included in the Path set now	2025-03-10 11:19:36 +01:00
Shannon Booth	10b32a8dd8	LibURL/Pattern: Stub out URL::Pattern::match This will allow us to complete the IDL interface, which will leave remaining work to implement the URL pattern specification within LibURL.	2025-03-04 16:32:09 -05:00
Shannon Booth	ff07cc1a6c	LibURL/Pattern: Add some scaffolding for the URLPattern constructor	2025-03-04 16:32:09 -05:00
Shannon Booth	873f7e4b3d	LibURL/Pattern: Add a representation of a URL Pattern error As the comment in this file explains, callers of LibURL APIs are meant to assume if they see any error, that it is a TypeError since that is all the spec throws at the moment. A custom error type exists here so that we can include more information in TypeErrors which are thrown.	2025-03-04 16:32:09 -05:00
Shannon Booth	de89f5af6d	LibURL: Remove the implicit URL constructors All URLs are now either constructed through the URL Parser or by default constructing a URL, and setting each of the fields of that URL manually. This makes it much more difficult to create invalid URLs.	2025-03-04 16:24:19 -05:00
zoupingshi	b609d8481a	LibURL+LibWeb+Tests: Remove redundant words	2025-02-27 10:35:39 +00:00
Shannon Booth	d62cf0a807	Everywhere: Remove some use of the URL constructors These make it too easy to construct an invalid URL, which makes it difficult to remove the valid state of URL - which this API relies on.	2025-02-19 08:01:35 -05:00
Shannon Booth	f3662c6f88	LibURL/Pattern: Add a representation of a URL Pattern This is the core object behind a URL pattern which when constructed can be used for matching the pattern against URLs. However, the implementation here is missing key functions such as the constructor and the 'test'/'exec' functions as that relies on a significant amount of supporting URLPattern infrastructure such as two different parsers and a tokenizer. However, this is enough for us to implement some more of the IDL wrapper layer of this specification.	2025-02-17 19:10:39 -05:00
Shannon Booth	5521836929	LibURL/Pattern: Add a representation of a URL Pattern 'component' A URL pattern consists of components such as the 'port', 'password', 'hostname', etc. A component is compiled from the input to the URLPattern constructor and is what is used for matching against URLs to produce a match result. This is also where the regex dependency is introduced into LibURL to support the URLPattern implementation.	2025-02-17 19:10:39 -05:00
Shannon Booth	07f054e067	LibURL: Add 'about:XXX' helper factory functions Currently we create URLs such as 'about:blank' through the StringView or ByteString constructor of URL. However, in order to eliminate the use of URL::is_valid, we need to get rid of these constructors as they make it way too easy to create an invalid URL. It is very cumbersome to construct an 'about:blank' URL when using URL::Parser::basic_parse. So instead of doing that, create some helper functions which will create the 'about:XXX' URLs with the correct properties set. Conveniently, this is also a much faster way of creating these URLs as it means we do not need to parse the URL and can set all of the members up front.	2025-02-15 17:05:55 +00:00
Shannon Booth	53826995f6	LibURL+LibWeb: Port URL::complete_url to Optional Removing one more source of the URL::is_valid API.	2025-02-15 17:05:55 +00:00
Shannon Booth	dc2c62825b	LibURL: Add a representation of a URL Pattern 'result' This is the return value of a URLPattern after `exec` is called on it. It conveys information about the named (or unnamed) regex groups matched for each component of the URL. For example, ``` let pattern = new URLPattern({ hostname: "{:subdomain.}*example.com" }); const result = pattern.exec({ hostname: "foo.bar.example.com" }); console.log(result.hostname.groups.subdomain); ``` Will log 'foo.bar'.	2025-02-10 17:05:15 +00:00
Shannon Booth	46bfced9ad	LibURL: Add representations of URLPattern{Init,Options,Input} The URLPattern spec is intended to be implemented inside of LibURL, with LibWeb only responsible for the IDL conversion layer, in a similar manner to how URL is implemented.	2025-01-27 18:07:17 +00:00
Shannon Booth	ca3d9d9ee0	LibURL+LibWeb+LibIPC: Represent blob URL entry's object using structs Instead of just putting in members directly, wrap them up in structs which represent what a URL blob entry is meant to hold per the spec. This makes more obvious what this is meant to represent, such as the ByteBuffer being used to represent the bytes behind a Blob. This also allows us to use a stronger type for a function that needs to return a Blob URL entry's object.	2025-01-21 19:22:07 +00:00
Sam Atkins	9a7ce901b7	LibURL: Gracefully handle a host having no public suffix Specifically, after implementing some recent spec changes to navigables, we end up calling `get_public_suffix("localhost")` here, which returns OptionalNone. This would previously crash. Our get_public_suffix() seems a little incorrect. From the spec: > If no rules match, the prevailing rule is "*". > https://github.com/publicsuffix/list/wiki/Format#algorithm However, ours returns an empty Optional in that case. To avoid breaking other users of it, this patch modifies Host's uses of it, rather than the function itself.	2025-01-21 18:17:18 +01:00
Shannon Booth	5bed8f4055	LibURL+LibWeb: Make URL::basic_parse return an Optional<URL> URL::basic_parse has a subtle bug where the resulting URL is not set to valid when StateOverride is provided and the URL parser early returns a valid URL. This has not surfaced as a problem so far, as the only users of the state override API provide an already valid URL buffer and also ignore the result of basic parsing with a state override. However, this bug surfaces implementing the URL pattern spec, which as part of URL canonicalization: * Provides a dummy URL record * Basic URL parses that URL with state override * Checks the result of the URL parser to validate the URL While we could set URL validity on every early return of the URL parser during state override, it has been a long-standing FIXME around the code to try and remove the awkward validity state of the URL class. So this commit makes the first stage of this change by migrating the basic parser API to return Optional, which also happens to make this subtle issue not a problem any more.	2025-01-11 10:08:29 -05:00
Shannon Booth	87c8ae31d3	LibURL: Set IDNA's IgnoreInvalidPunycode to false See: https://github.com/whatwg/url/commit/a6e449 - which should have no functional change.	2024-12-05 17:29:49 +01:00
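
Commit a34f7a5bd1 above states that the registrable domain is the public suffix plus one additional label, and commit 9a7ce901b7 notes that a host such as "localhost" may have no public suffix at all. The following is a minimal JavaScript sketch of that relationship, not LibURL's implementation. The names `PUBLIC_SUFFIXES`, `publicSuffix`, and `registrableDomain` are hypothetical, and the four-entry suffix set is an assumption standing in for the real Public Suffix List.

```javascript
// Hypothetical, tiny suffix set for illustration only.
// The real Public Suffix List is far larger and has wildcard/exception rules.
const PUBLIC_SUFFIXES = new Set(["com", "org", "uk", "co.uk"]);

// Returns the longest matching suffix of `host`, or null if none matches.
// (Returning null mirrors the "no public suffix" case described in 9a7ce901b7.)
function publicSuffix(host) {
  const labels = host.split(".");
  for (let i = 0; i < labels.length; i++) {
    const candidate = labels.slice(i).join(".");
    if (PUBLIC_SUFFIXES.has(candidate)) {
      return candidate;
    }
  }
  return null;
}

// Registrable domain = public suffix plus one additional label to its left.
function registrableDomain(host) {
  const suffix = publicSuffix(host);
  // No suffix, or the host *is* a public suffix: there is nothing to register.
  if (suffix === null || suffix === host) {
    return null;
  }
  const hostLabels = host.split(".");
  const suffixLabelCount = suffix.split(".").length;
  return hostLabels.slice(hostLabels.length - suffixLabelCount - 1).join(".");
}

console.log(registrableDomain("www.example.co.uk")); // example.co.uk
console.log(registrableDomain("localhost"));         // null
```

Returning the bare suffix ("co.uk") instead of "example.co.uk" is the mistake the first commit describes; the `null` branch corresponds to the graceful handling added in the second.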

65 commits
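
Commit dcb7842f59 in the list above describes tracking an opaque path with a boolean alongside a single-element list, and explains why a list holding one empty string serializes incorrectly unless that boolean is set. A rough JavaScript sketch of the distinction follows. The record shape (`paths`, `hasOpaquePath`) and the function name `serializePath` are hypothetical and chosen for illustration; they are not LibURL's API.

```javascript
// Hypothetical URL record: `paths` is always an array of strings, and
// `hasOpaquePath` marks the case where paths[0] is the whole opaque path.
function serializePath(url) {
  if (url.hasOpaquePath) {
    // Opaque path: the single stored string is the serialization.
    return url.paths[0];
  }
  // List path: every segment is prefixed with "/".
  return url.paths.map((segment) => "/" + segment).join("");
}

// The same stored data serializes differently depending on the flag.
const listPath = { paths: [""], hasOpaquePath: false };
const opaquePath = { paths: [""], hasOpaquePath: true };

console.log(JSON.stringify(serializePath(listPath)));   // "/"
console.log(JSON.stringify(serializePath(opaquePath))); // ""
```

With the flag left unset, the empty opaque path comes out as "/", which is the kind of incorrect serialization the commit message refers to.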