`<syntax>` is a limited subset of the "value definition syntax" used in
CSS specs. It's used for `@property`'s `syntax` descriptor (for example,
`syntax: "<length>";` inside an `@property` rule), and for the
`type()` function in `attr()`.
Now that headless mode is built into the main Ladybird executable, the
headless-browser's only purpose is to run tests. So let's move it to the
testing directory and rename it to test-web (a la test-js / test-wasm).
Start work on a speculative HTML parser in Swift. This component will
walk ahead of the normal HTML parser, looking for fetch() requests to
make while the normal parser is blocked. This work exposed many holes in
the Swift/C++ interop, which have been reported upstream.
When the TokenStream code was originally written, there was no such
concept in the CSS Syntax spec. Since then, it has been officially
added (https://drafts.csswg.org/css-syntax/#css-token-stream), and the
parsing algorithms are described in terms of it. This patch brings our
implementation in line with the spec. A few deprecated TokenStream
methods are left around until their users are also updated to match the
newer spec.
There are a few differences:
- They name things differently. The main confusing one is that we had a
  `next_token()` which consumed a token and returned it, while the spec's
  `next_token()` merely peeks at the next token. The spec names are
  honestly better than what I'd come up with. (`discard_a_token()` is a
  nice addition too!)
- We used to store the index of the token that was just consumed; the
  spec instead stores the index of the token that will be consumed next.
  This is a perfect breeding ground for off-by-one errors, so I've
  finally added a test suite for TokenStream itself.
- We use a transaction system for rewinding, while the spec uses a stack
  of "marks" which can be manually rewound to. These should be able to
  coexist as long as we stick with marks in the parser spec algorithms,
  and stick with transactions elsewhere. (A rough sketch of the
  spec-style interface follows this list.)
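To make the naming and indexing differences concrete, here is a minimal
sketch of a spec-style token stream. This is illustrative only: a
simplified stand-in for Ladybird's actual TokenStream, using std types
and a placeholder Token.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

struct Token { /* stand-in for the real CSS token type */ };

class TokenStream {
public:
    explicit TokenStream(std::vector<Token> tokens)
        : m_tokens(std::move(tokens))
    {
    }

    // Spec: next_token() only *peeks* at the token that would be consumed.
    Token const& next_token() const
    {
        static Token const eof {};
        return m_index < m_tokens.size() ? m_tokens[m_index] : eof;
    }

    // Spec: consume_a_token() returns the next token and advances.
    Token const& consume_a_token()
    {
        auto const& token = next_token();
        discard_a_token();
        return token;
    }

    // Spec: discard_a_token() advances without returning anything.
    void discard_a_token()
    {
        if (m_index < m_tokens.size())
            ++m_index;
    }

    // Spec-style marks: remember the current position, rewind to it later.
    void mark() { m_marks.push_back(m_index); }
    void restore_a_mark()
    {
        m_index = m_marks.back();
        m_marks.pop_back();
    }
    void discard_a_mark() { m_marks.pop_back(); }

private:
    std::vector<Token> m_tokens;
    // Index of the token that will be consumed *next*, not the one that
    // was just consumed; exactly the off-by-one hazard the new test
    // suite guards against.
    size_t m_index { 0 };
    std::vector<size_t> m_marks;
};
```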
Just to sanity-check that we can import the library, and that it at
least interprets the generated enumeration values properly, let's
do some simple testing of the Swift integration.
We had an off-by-one error when returning the result of parsing a quoted
string in Web::Fetch::Infrastructure::collect_an_http_quoted_string.
Instead of backtracking the lexer and re-consuming the backtracked
string, do a simple substring operation.
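The shape of the fix is roughly: remember the position where the quoted
string starts, consume forward, and slice that range out at the end. A
minimal sketch with std types (not the actual AO, which also handles the
spec's extract-value flag):

```cpp
#include <cstddef>
#include <string_view>

// Sketch of the "remember the start, substring at the end" pattern.
// Assumes input[position] is the opening '"', per the spec's precondition.
std::string_view collect_http_quoted_string(std::string_view input, size_t& position)
{
    size_t start = position; // Position of the opening quote.
    ++position;              // Skip the opening '"'.
    while (position < input.size() && input[position] != '"') {
        if (input[position] == '\\' && position + 1 < input.size())
            ++position; // A backslash escapes the following character.
        ++position;
    }
    if (position < input.size())
        ++position; // Skip the closing '"', if present.
    // The consumed range is exactly [start, position), so the length is
    // position - start: no lexer backtracking, no off-by-one.
    return input.substr(start, position - start);
}
```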
This is a fetching AO and is only used by LibWeb in the context of fetch
tasks. Move it to LibWeb with the other fetch methods.
The main reason for this is that it requires the use of other LibWeb AOs,
such as the forgiving Base64 decoder and MIME sniffing, which aren't
available within LibURL.
We have code inside LibWeb that uses the
`AK::StringUtils::convert_to_uint` and `AK::StringUtils::convert_to_int`
methods for parsing integers. This works well for the most part, but
according to the spec, trailing characters are allowed and should be
ignored; that is not how the `StringUtils` methods are implemented.
This patch adds two new methods named `parse_integer` and
`parse_non_negative_integer` inside the `Web::HTML` namespace that use
`StringUtils` under the hood but add a bit more logic to make them
spec compliant.
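For illustration, a rough sketch of the HTML spec's "rules for parsing
integers" (simplified, using std types rather than AK, with overflow
handling omitted and std::isspace() only approximating the spec's
ASCII-whitespace set):

```cpp
#include <cctype>
#include <optional>
#include <string_view>

// Simplified sketch of the HTML "rules for parsing integers".
std::optional<int> parse_integer(std::string_view input)
{
    size_t i = 0;
    while (i < input.size() && std::isspace(static_cast<unsigned char>(input[i])))
        ++i; // Skip leading whitespace.

    bool negative = false;
    if (i < input.size() && (input[i] == '+' || input[i] == '-')) {
        negative = input[i] == '-';
        ++i;
    }

    if (i >= input.size() || !std::isdigit(static_cast<unsigned char>(input[i])))
        return std::nullopt; // No digits at all: parse error.

    int value = 0;
    while (i < input.size() && std::isdigit(static_cast<unsigned char>(input[i]))) {
        value = value * 10 + (input[i] - '0');
        ++i;
    }
    // Anything left over ("trailing characters") is simply ignored,
    // which is where convert_to_int() and the spec disagree.
    return negative ? -value : value;
}
```

With this, parsing "123abc" yields 123, where a convert_to_int-style
parser would reject the whole string.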
Now that the CSSPixels implementation has evolved from a simple wrapper
around double into fixed-point arithmetic with saturation, it makes
sense to have separate tests for it.
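The core property such tests pin down is that arithmetic saturates
rather than wraps. A minimal illustration of saturating addition (not
the real CSSPixels internals, which store a fixed number of fractional
bits):

```cpp
#include <cstdint>
#include <limits>

// Illustrative saturating add on the 32-bit raw value of a fixed-point
// type: widen to 64 bits so the true sum fits, then clamp to the
// representable range instead of wrapping on overflow.
int32_t saturating_add(int32_t a, int32_t b)
{
    int64_t sum = int64_t(a) + int64_t(b);
    if (sum > std::numeric_limits<int32_t>::max())
        return std::numeric_limits<int32_t>::max();
    if (sum < std::numeric_limits<int32_t>::min())
        return std::numeric_limits<int32_t>::min();
    return static_cast<int32_t>(sum);
}
```

A dedicated test can then assert, for example, that adding 1 to the
maximum value stays at the maximum instead of wrapping to the minimum.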
Using a file(GLOB) to find all the test files in a directory is an easy
hack to get things started, but it has some drawbacks. Namely, if you add
a test, it won't be picked up without re-running CMake. `ninja` seems
to do this automatically, but it would be nice to one day stop seeing it
rechecking our globbed directories.
The test suite includes a few basic tests and a very crude regression
test, which just concatenates the to_string() of all tokens and checks
that the resulting String's hash matches an expected value. This relies
on the format of HTMLToken::to_string() staying the same, which is not
ideal.
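In outline, the crude regression check looks something like this
(simplified with std types and a stand-in token; the real test uses
HTMLToken::to_string() and a stable string hash, whereas std::hash is
not guaranteed to be stable across implementations):

```cpp
#include <functional>
#include <string>
#include <vector>

struct Token {
    std::string text; // Stand-in for a serialized HTMLToken.
};

// Concatenate every token's string form and compare a single hash against
// a golden value; any change to the serialization format breaks the test.
bool matches_golden_hash(std::vector<Token> const& tokens, size_t expected_hash)
{
    std::string concatenated;
    for (auto const& token : tokens)
        concatenated += token.text;
    return std::hash<std::string> {}(concatenated) == expected_hash;
}
```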