Commit graph

34 commits

Author SHA1 Message Date
Tim Ledbetter
a187d5f28f LibWeb/SVG: Process script element when its text content changes 2025-02-26 16:08:35 +01:00
Sam Atkins
334c55c999 LibWeb/HTML: Add missing spec text/link for insert_html_element() 2025-02-20 21:55:34 +00:00
Sam Atkins
91fe0e1a1a LibWeb/HTML: Spec-ify reset_the_insertion_mode_appropriately()
It was already to-spec, with a couple of non-behavior-affecting changes,
but now it's clearer that this is the case. :^)
2025-02-20 21:55:34 +00:00
Sam Atkins
d7b4e1f2e3 LibWeb: Add missing text to find_appropriate_place_for_inserting_node() 2025-02-20 21:55:34 +00:00
Andreas Kling
550613e526 LibWeb: Remember when HTML parser should ignore next line feed character
There's a quirk in HTML where the parser should ignore any line feed
character immediately following a `pre` or `textarea` start tag.

This was working fine when we could peek ahead in the input stream and
see the next token, but didn't work in character-at-a-time parsing with
document.write().

This commit adds the "can ignore next line feed character" as a parser
flag that is maintained across invocations, making it work in this
parsing mode as well.

20 new passes in WPT/html/syntax/parsing/ :^)
2025-02-20 14:32:13 +01:00
Andreas Kling
611833429a LibWeb: Lazily merge text nodes when invoking HTML parser incrementally
Instead of always inserting a new text node, we now continue appending
to an extisting text node if the parser's character insertion point is
a suitable text node.

This fixes an issue where multiple invocations of document.write() would
create unnecessary sequences of text nodes. Such sequences are now
merged automatically.

19 new passes in WPT/html/syntax/parsing/ :^)
2025-02-20 14:32:13 +01:00
Andreas Kling
0c0fe09e70 LibWeb: Preserve attributes in "reconstruct active formatting elements"
25 new passes in WPT/html/syntax/parsing/ :^)
2025-02-20 14:32:13 +01:00
Andreas Kling
7549f6842f LibWeb: Ensure "frameset ok" flag is disabled after parsing pre tag
2 new passes in WPT/html/syntax/parsing/ :^)
2025-02-20 14:32:13 +01:00
Andreas Kling
2e59dbd10f LibWeb: Don't process frameset token twice in "in body" insertion mode
We were neglecting to return after handling the `frameset` start tag,
which caused us to process it twice, once properly and once generically.

54 new passes in WPT/html/syntax/parsing/ :^)
2025-02-20 14:32:13 +01:00
Andreas Kling
6f7b865cd1 LibWeb: Let HTML parser handle EOF inserted by document.close()
Before this change, the explicit EOF inserted by document.close() would
instantly abort the parser. This meant that parsing algorithms that ran
as part of the parser unwinding on EOF would never actually run.

591 new passes in WPT/html/syntax/parsing/ :^)

This exposed a problem where the parser would try to insert a root
<html> element on EOF in a document where someone already inserted such
an element via direct DOM manipulation. The parser now gracefully
handles this scenario. It's covered by existing tests (which would
crash without this change.)
2025-02-20 14:32:13 +01:00
Shannon Booth
7441aa34e4 LibWeb/HTML: Bail from HTML parsing when EOF hit on document.close
This fixes a crash in the included test that regressed in 0adf261,
and is hit by the following HTML:

```html
<body></body>
<script>
  const frame = document.body.appendChild(document.createElement("iframe"));
  frame.contentDocument.open();
  const child = frame.contentDocument.createElement("html")
  const html = frame.contentDocument.appendChild(child);
  frame.contentDocument.close();
</script>
```

I am not 100% sure this is fully the correct fix and there are other
cases which would not work properly. But it's definitely an improvement
to make the confuisingly named 'insert_an_eof' function of the tokenizer
actually do something.
2025-02-09 19:20:09 +00:00
Andreas Kling
4f855286d7 LibWeb: Clamp layout content sizes to a max value instead of crashing
We've historically asserted that no "saturated" size values end up as
final metrics for boxes in layout. This always had a chance of producing
false positives, since you can trivially create extremely large boxes
with CSS.

The reason we had those assertions was to catch bugs in our own engine
code where we'd incorrectly end up with non-finite values in layout
algorithms. At this point, we've found and fixed all known bugs of that
nature, and what remains are a bunch of false positives on pages that
create very large scrollable areas, iframes etc.

So, let's change it! We now clamp content width and height of boxes to
17895700 pixels, apparently the same cap as Firefox uses.

There's also the issue of calc() being able to produce non-finite
values. Note that we don't clamp the result of calc() directly, but
instead just clamp values when assigning them to content sizes.

Fixes #645.
Fixes #1236.
Fixes #1249.
Fixes #1908.
Fixes #3057.
2025-02-05 18:28:55 +01:00
Tim Ledbetter
8dfd382e12 LibWeb: Use as_if instead of dynamic_cast in a few places 2025-01-31 14:29:48 +01:00
Timothy Flynn
85b424464a AK+Everywhere: Rename verify_cast to as
Follow-up to fc20e61e72.
2025-01-21 11:34:06 -05:00
Shannon Booth
51102254b5 LibWeb/HTML: Scroll to the fragment before loading the document
Otherwise nowhere ends up scrolling to the fragment specified by the
fragment in document's URL. This fixes ladybird scrolling to the
correct location in the document when navigating to a link that
has a fragment, e.g:

https://html.spec.whatwg.org/multipage/browsing-the-web.html#try-to-scroll-to-the-fragment

As well as use of the :target selector.
2025-01-15 12:43:48 +00:00
Shannon Booth
64eeda6450 LibWeb: Return a representation of an 'Agent' in 'relevant agent'
This makes it more convenient to use the 'relvant agent' concept,
instead of the awkward dynamic casts we needed to do for every call
site.

mutation_observers is also changed to hold a GC::Root instead of raw
GC::Ptr. Somehow this was not causing problems before, but trips up CI
after these changes.
2025-01-11 10:39:48 -05:00
Tim Ledbetter
f8b8c9c4a4 LibWeb: Wait until ReadyState is complete before detaching HTML parser
Previously, the DOM complete time was never being set, as the HTML
parser was detached before `DocumentReadyState` was set to complete.
2025-01-11 11:11:52 +01:00
Tim Ledbetter
983540a244 LibWeb: Add missing spec links to HTML parser 2025-01-08 12:05:08 +00:00
Sam Atkins
89296b88a0 LibWeb: Bring fragment parsing up to spec
Corresponds to https://github.com/whatwg/html/pull/10874

Also, parse_fragment() returns ExceptionOr, so stop voiding the error
from append_child().
2025-01-07 16:05:59 +01:00
Sam Atkins
4c5a40579a LibWeb: Match spec changes to "create an element" algorithm
Corresponds to https://github.com/whatwg/html/pull/10857

No code changes, only comments.
2024-12-18 19:22:44 +00:00
Sam Atkins
30d23ad090 LibWeb/HTML: Use camelCase for variables in adoption agency comments
This is a spec change but I don't know when it occurred.

Also reformat some of the spec comments for clarity.

No code changes, only comments.
2024-12-18 19:22:44 +00:00
Andrew Kaster
66519af43f LibWeb: Remove some uses of [&] lambda captures for queued tasks
Using a default reference capture for these kinds of tasks is dangerous
and prone to error. Some of the variables should for sure be captured
by value so that we can keep a GC object alive rather than trying to
refer to stack objects.
2024-12-10 07:13:00 +01:00
Jelle Raaijmakers
1514197e36 LibWeb: Remove dom_ from dom_exception_to_throw_completion
We're not converting `WebIDL::DOMException`, but `WebIDL::Exception`
instead.
2024-12-09 20:02:51 -07:00
joshua stein
12442ca430 LibWeb: Remove a misleading duplicate comment in HTMLParser 2024-11-30 07:07:30 +00:00
Psychpsyo
e602578501 LibWeb: Add handling for 'an end tag whose tag name is sarcasm' 2024-11-27 10:42:58 +00:00
Psychpsyo
397579096f LibWeb: Add search element to list of Special tags 2024-11-27 11:00:58 +01:00
Gingeh
0adf261c32 LibWeb: Don't end parsing after reaching the insertion point 2024-11-26 23:50:18 +01:00
Andreas Kling
1d554f97de LibWeb: Reset the "stop parsing" flag when entering HTML parser
Otherwise we'll always bail after processing one token, which is not
what we want.
2024-11-23 19:19:31 +01:00
Andreas Kling
7309f2a6f9 LibWeb: Add spec comments to HTML parser's "initial" insertion mode 2024-11-23 19:19:31 +01:00
Psychpsyo
f09ed59351 LibWeb: Add the search element 2024-11-19 23:30:43 +00:00
Shannon Booth
f87041bf3a LibGC+Everywhere: Factor out a LibGC from LibJS
Resulting in a massive rename across almost everywhere! Alongside the
namespace change, we now have the following names:

 * JS::NonnullGCPtr -> GC::Ref
 * JS::GCPtr -> GC::Ptr
 * JS::HeapFunction -> GC::Function
 * JS::CellImpl -> GC::Cell
 * JS::Handle -> GC::Root
2024-11-15 14:49:20 +01:00
Shannon Booth
1e54003cb1 LibJS+LibWeb: Rename Heap::allocate_without_realm to Heap::allocate
Now that the heap has no knowledge about a JavaScript realm and is
purely for managing the memory of the heap, it does not make sense
to name this function to say that it is a non-realm variant.
2024-11-13 16:51:44 -05:00
Shannon Booth
9b79a686eb LibJS+LibWeb: Use realm.create<T> instead of heap.allocate<T>
The main motivation behind this is to remove JS specifics of the Realm
from the implementation of the Heap.

As a side effect of this change, this is a bit nicer to read than the
previous approach, and in my opinion, also makes it a little more clear
that this method is specific to a JavaScript Realm.
2024-11-13 16:51:44 -05:00
Timothy Flynn
93712b24bf Everywhere: Hoist the Libraries folder to the top-level 2024-11-10 12:50:45 +01:00
Renamed from Userland/Libraries/LibWeb/HTML/Parser/HTMLParser.cpp (Browse further)