Commit graph

37 commits

Author SHA1 Message Date
Idan Horowitz
d6c2893663 LibWeb: Add initial CookieStore support 2025-08-08 13:09:58 -04:00
Jelle Raaijmakers
22bda8e5e2 LibWeb: Stub Geolocation API
Some checks are pending
CI / macOS, arm64, Sanitizer_CI, Clang (push) Waiting to run
CI / Linux, x86_64, Fuzzers_CI, Clang (push) Waiting to run
CI / Linux, x86_64, Sanitizer_CI, GNU (push) Waiting to run
CI / Linux, x86_64, Sanitizer_CI, Clang (push) Waiting to run
Package the js repl as a binary artifact / macOS, arm64 (push) Waiting to run
Package the js repl as a binary artifact / Linux, x86_64 (push) Waiting to run
Run test262 and test-wasm / run_and_update_results (push) Waiting to run
Lint Code / lint (push) Waiting to run
Label PRs with merge conflicts / auto-labeler (push) Waiting to run
Push notes / build (push) Waiting to run
2025-06-21 10:00:29 +02:00
Sam Atkins
ffa1dba96a LibWeb: Generate pseudo-element code from JSON
Initially, this generates the enum and to/from-string functions. The
JSON itself contains more data than that, but it's unused for now.
2025-03-24 09:49:50 +00:00
Ryan Liptak
0b0f47e320 LibWeb: Make named character references more spec-compliant & efficient
There are two changes happening here: a correctness fix, and an
optimization. In theory they are unrelated, but the optimization
actually paves the way for the correctness fix.

Before this commit, the HTML tokenizer would attempt to look for named
character references by checking from after the `&` until the end of
m_decoded_input, which meant that it was unable to recognize things like
named character references that are inserted via `document.write` one
byte at a time. For example, if `∉` was written one-byte-at-a-time
with `document.write`, then the tokenizer would only check against `n`
since that's all that would exist at the time of the check and therefore
erroneously conclude that it was an invalid named character reference.

This commit modifies the approach taken for named character reference
matching by using a trie-like structure (specifically, a deterministic
acyclic finite state automaton or DAFSA), which allows for efficiently
matching one-character-at-a-time and therefore it is able to pick up
matching where it left off after each code point is consumed.

Note: Because it's possible for a partial match to not actually develop
into a full match (e.g. `&notindo` which could lead to `⋵̸`),
some backtracking is performed after-the-fact in order to only consume
the code points within the longest match found (e.g. `&notindo` would
backtrack back to `&not`).

With this new approach, `document.write` being called one-byte-at-a-time
is handled correctly, which allows for passing more WPT tests, with the
most directly relevant tests being
`/html/syntax/parsing/html5lib_entities01.html`
and
`/html/syntax/parsing/html5lib_entities02.html`
when run with `?run_type=write_single`. Additionally, the implementation
now better conforms to the language of the spec (and resolves a FIXME)
because exactly the matched characters are consumed and nothing more, so
SWITCH_TO is able to be used as the spec says instead of RECONSUME_IN.

The new approach is also an optimization:

- Instead of a linear search using `starts_with`, the usage of a DAFSA
  means that it is always aware of which characters can lead to a match
  at any given point, and will bail out whenever a match is no longer
  possible.
- The DAFSA is able to take advantage of the note in the section
  `13.5 Named character references` that says "This list is static and
  will not be expanded or changed in the future." and tailor its Node
  struct accordingly to tightly pack each node's data into 32-bits.
  Together with the inherent DAFSA property of redundant node
  deduplication, the amount of data stored for named character reference
  matching is minimized.

In my testing:

- A benchmark tokenizing an arbitrary set of HTML test files was about
  1.23x faster (2070ms to 1682ms).
- A benchmark tokenizing a file with tens of thousands of named
  character references mixed in with truncated named character
  references and arbitrary ASCII characters/ampersands runs about 8x
  faster (758ms to 93ms).
- The size of `liblagom-web.so` was reduced by 94.96KiB.

Some technical details:

A DAFSA (deterministic acyclic finite state automaton) is essentially a
trie flattened into an array, but it also uses techniques to minimize
redundant nodes. This provides fast lookups while minimizing the
required data size, but normally does not allow for associating data
related to each word. However, by adding a count of the number of
possible words from each node, it becomes possible to also use it to
achieve minimal perfect hashing for the set of words (which allows going
from word -> unique index as well as unique index -> word). This allows
us to store a second array of data so that the DAFSA can be used as a
lookup for e.g. the associated code points.

For the Swift implementation, the new NamedCharacterReferenceMatcher
was used to satisfy the previous API and the tokenizer was left alone
otherwise. In the future, the Swift implementation should be updated to
use the same implementation for its NamedCharacterReference state as
the updated C++ implementation.
2025-03-22 16:03:44 +01:00
Sam Atkins
f148af0a93 LibWeb: Move XMLSerializer into HTML directory
The DOMParsing spec is in the process of being merged into the HTML one,
gradually. The linked spec change moves XMLSerializer, but many of the
algorithms are still in the DOMParsing spec so I've left the links to
those alone.

I've done my best to update the GN build but since I'm not actually
using it, I might have done that wrong.

Corresponds to 2edb8cc7ee
2025-03-04 16:44:41 +00:00
Jelle Raaijmakers
3504370281 LibWeb: Add stubbed Media Source Extensions API
Just the boilerplate :^)
2024-11-01 13:23:45 -04:00
Andrew Kaster
be13a0ec3f Meta: Port changes to gn build
8b3a5d0b96
99ef078c97
dc401f49ea
d30ae92b82
c22a2d8f2b
6cf7d07a98
0c98c7637e
00487a7b25
1db243c006
2024-10-09 06:43:31 -04:00
Andrew Kaster
29416befe6 LibWeb: Add ServiceWorker job registration and execution
Now we can register jobs and they will be executed on the event loop
"later". This doesn't feel like the right place to execute them, but
the spec needs some updates in this regard anyway.
2024-10-04 07:08:08 +02:00
Andrew Kaster
7880b2fba9 Meta: Update LibWeb to build again with gn 2024-09-27 10:15:08 -06:00
Andrew Kaster
8aac8f25ba Meta: Handle removed and renamed libraries in gn build 2024-09-27 10:15:08 -06:00
Jelle Raaijmakers
85fd2e281b LibMedia: Absorb LibAudio
LibMedia will be responsible for both audio and video decoding.
2024-09-12 10:01:19 +02:00
Sam Atkins
8cbc211616 Meta: Make embed_as_string_view.py produce Strings instead
This is only used for CSS style sheets. One case wants it as a String,
and the others don't care, but will in future also want to have the
source as a String.
2024-09-03 10:12:07 +01:00
Sam Atkins
6a74b01644 LibWeb: Rename "identifier" and "ValueID" to "Keyword" where correct
For a long time, we've used two terms, inconsistently:
- "Identifier" is a spec term, but refers to a sequence of alphanumeric
  characters, which may or may not be a keyword. (Keywords are a
  subset of all identifiers.)
- "ValueID" is entirely non-spec, and is directly called a "keyword" in
  the CSS specs.

So to avoid confusion as much as possible, let's align with the spec
terminology. I've attempted to change variable names as well, but
obviously we use Keywords in a lot of places in LibWeb and so I may
have missed some.

One exception is that I've not renamed "valid-identifiers" in
Properties.json... I'd like to combine that and the "valid-types" array
together eventually, so there's no benefit to doing an extra rename
now.
2024-08-15 13:58:38 +01:00
Matthew Olsson
ac35f76e67 Meta: Remove GenerateCSSEasingFunctions 2024-06-16 07:12:46 +02:00
Jamie Mansfield
8f0d035145 LibWeb: Implement should block mixed content request 2024-06-07 09:50:30 +02:00
Andreas Kling
f2fd8fc928 Everywhere: Remove LibGemini
This hasn't been maintained (or worked at all) for a long time,
and it's not a widely supported protocol, so let's drop it.
2024-06-04 09:19:39 +02:00
Andreas Kling
e4cd91761d Everywhere: Remove LibMarkdown
This was used to convert markdown into HTML for display in the browser,
but no other browser behaves this way, so let's simplify things by
removing it.

(Yes, we could implement all kinds of "convert to HTML and display" for
every file format out there, but that's far outside the scope of a
browser engine.)
2024-06-04 09:19:39 +02:00
Andreas Kling
e70d96e4e7 Everywhere: Remove a lot more things we don't need 2024-06-03 10:53:53 +02:00
Timothy Flynn
f3e04085d8 Meta: Port recent changes to the GN build
57714fbb38
76418f3ffa
8d5665ebe1
3aa36caa52
bfa330914d
2024-05-24 08:47:26 -04:00
Timothy Flynn
4bf0ddad4c Meta: Remove the generated bindings forwarding header from the GN build
This was removed in commit 22705e3065.
2024-05-12 18:10:05 -04:00
Timothy Flynn
0bd5e94958 Meta: Add missing IDL files to the GN build
Unclear why these being missing did not cause a compilation error.
2024-05-12 15:38:18 -06:00
Andrew Kaster
a0623a47de LibWeb: Implement importKey for RSA-OAEP 2024-03-25 17:01:23 -06:00
Timothy Flynn
a729677561 Meta: Port recent changes to the GN build
6c26ff567e
e800605ad3
cf7c933312
2024-03-23 17:26:31 -04:00
Timothy Flynn
e1e42f5f58 Meta: Port recent changes to the GN build
f9e5b43b7a
b9dc2d7ebf
fa692ae3f6
2024-02-29 22:16:39 -05:00
Andreas Kling
f900957d26 LibGfx+LibWeb: Move Gfx::ScaledFont caching from LibWeb into LibGfx
Before this change, we would only cache and reuse Gfx::ScaledFont
instances for downloaded CSS fonts.

By moving it into Gfx::VectorFont, we get caching for all vector fonts,
including local system TTFs etc.

This avoids a *lot* of style invalidations in LibWeb, since we now vend
the same Gfx::Font pointer for the same font when used repeatedly.
2023-12-26 18:15:55 +01:00
Timothy Flynn
50406f541b Meta: Port recent changes to GN build
124c378472
2023-11-15 11:28:39 -05:00
Timothy Flynn
652bbe172b Meta: Port recent changes to GN build
86ee7d219e
75caccafa4
1b30b510b9
bb43bd2dee
6322d68b1b
73ef102b01
923027b1df
dfa79ba6d8
66c9696687
521f8bd5f2
734076946b
0df06ce273
1ca46afa2f
66bd75f2b9
43dc9dfb69
4b94b0b561
4c5d48f861
c4efc0a5aa
3999c74237
4d356cfca5
2023-11-14 09:36:36 -05:00
Andrew Kaster
9a3e9047a5 Meta: Update GN build for recent changes
9a026fc8d5
ae1ac9871b
07b332e17c

And some missed changes from the past that weren't hit because no one
was referencing those symbols :^).
2023-10-11 10:56:24 -06:00
Sam Atkins
ae4b8d86df LibWeb: Include standard SVG user agent style sheet
For now, part of this is commented-out. Our current implementations of
`<mask>` and `<symbol>` rely on creating layout nodes, so they can't be
`display: none`.
2023-09-23 16:27:14 +02:00
Sebastian Zaha
b4df4d66dc Meta: Port recent build changes to gn build
This ports the following commits:

f76c614a84
ddbe6bd7b4
2eaa528a0e
1b40bf9783
9f6ceff7cf
52d6df5ee5
9e22f01eba
bf4e2f3e9c
da2cd73bcf
2023-08-19 20:09:26 -06:00
Sam Atkins
f76c614a84 LibWeb: Generate PseudoClass metadata
The usual to/from-string functions, and metadata about whether the
pseudo-class is a a function or not, and what type of parameter it
takes.
2023-08-12 16:26:32 +02:00
Jonah
0b2da4f8c6 LibWeb: Add the default user agent MathML stylesheet
We now apply MathML's default user agent style sheet along with other
default styles. This sheet is not mixed in with the other styles in
CSS/Default.css because it is a namespaced stylesheet and so has to
be its own sheet.
2023-08-12 07:59:23 +01:00
Andrew Kaster
2e0e393889 Meta: Port recent build changes to gn build
This ports the following commits:

e8a63eeb0e
bec07d4af7
0e12503586
2023-07-27 12:08:22 -06:00
Andrew Kaster
57ad638dcc Meta: Port 618c0402 to gn build 2023-07-17 00:00:49 +01:00
Andrew Kaster
a21a08cc9d Meta: Format LibWeb gn files
This was missed when merging the initial set. Linter should be next :^)
2023-07-13 14:07:25 -06:00
Andrew Kaster
2f16aa45b7 Meta: Port dd073b2711 to gn build 2023-07-13 14:07:25 -06:00
Andrew Kaster
85c8cd5205 Meta: Add gn build rules for LibWeb 2023-07-09 16:22:58 -06:00