LibWeb: Treat DOM::Range offsets as UTF-16 code unit offsets

We generated `PaintableFragment`s with a start and length represented in
UTF-8 byte offsets, but failed to consider that the offsets in a
`DOM::Range` are actually expressed in UTF-16 code units.

This is a bit of a mess: almost all web specs use UTF-16 code units as
the unit for indexing into text nodes, but we almost exclusively use
UTF-8 in our code base. Arguably the best thing would for us to use
UTF-16 everywhere as well: it prevents these mismatches in our
implementations for the price of a bit more memory usage - and even that
could potentially be optimized for.

But for now, try to do the correct thing and lazily allocate UTF-16 data
in a `PaintableFragment` whenever we need to index into it or if we're
asked to determine the code unit offset of a pixel position.
This commit is contained in:
Jelle Raaijmakers 2025-06-12 13:19:52 +02:00 committed by Jelle Raaijmakers
parent dbbdf2cebc
commit 3df83dade8
Notes: github-actions[bot] 2025-06-13 13:09:49 +00:00
6 changed files with 110 additions and 36 deletions

View file

@ -212,7 +212,7 @@ static CSSPixelPoint compute_mouse_event_offset(CSSPixelPoint position, Painting
}
// https://drafts.csswg.org/css-ui/#propdef-user-select
static void set_user_selection(GC::Ptr<DOM::Node> anchor_node, unsigned anchor_offset, GC::Ptr<DOM::Node> focus_node, unsigned focus_offset, Selection::Selection* selection, CSS::UserSelect user_select)
static void set_user_selection(GC::Ptr<DOM::Node> anchor_node, size_t anchor_offset, GC::Ptr<DOM::Node> focus_node, size_t focus_offset, Selection::Selection* selection, CSS::UserSelect user_select)
{
// https://drafts.csswg.org/css-ui/#valdef-user-select-contain
// NOTE: This is clamping the focus node to any node with user-select: contain that stands between it and the anchor node.