mirror of https://github.com/LadybirdBrowser/ladybird.git (synced 2025-08-11 02:29:21 +00:00)
LibWeb: Treat DOM::Range offsets as UTF-16 code unit offsets
We generated `PaintableFragment`s with a start and length represented in UTF-8 byte offsets, but failed to consider that the offsets in a `DOM::Range` are actually expressed in UTF-16 code units.

This is a bit of a mess: almost all web specs use UTF-16 code units as the unit for indexing into text nodes, but we almost exclusively use UTF-8 in our code base. Arguably the best thing would be for us to use UTF-16 everywhere as well: it prevents these mismatches in our implementations for the price of a bit more memory usage, and even that could potentially be optimized for.

But for now, try to do the correct thing and lazily allocate UTF-16 data in a `PaintableFragment` whenever we need to index into it or if we're asked to determine the code unit offset of a pixel position.
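To make the mismatch concrete, here is a minimal JavaScript sketch using the same text as the test in this commit. The helper `utf16OffsetToUtf8ByteOffset` is hypothetical and only illustrates the two units; it is not how Ladybird's C++ code performs the conversion.

```javascript
// Hypothetical helper (not part of Ladybird): maps a UTF-16 code unit offset,
// as used by DOM::Range, to the matching UTF-8 byte offset in the same text.
function utf16OffsetToUtf8ByteOffset(text, codeUnitOffset) {
    // String.prototype.slice indexes by UTF-16 code units.
    // Assumption: the offset does not split a surrogate pair.
    const prefix = text.slice(0, codeUnitOffset);
    // TextEncoder always encodes to UTF-8.
    return new TextEncoder().encode(prefix).length;
}

const text = "😭foobar😭";

// Each emoji is 2 UTF-16 code units but 4 UTF-8 bytes.
console.log(text.length);                           // 10 code units
console.log(new TextEncoder().encode(text).length); // 14 bytes

// The position right after the first emoji differs per unit.
console.log(utf16OffsetToUtf8ByteOffset(text, 2));  // 4
```

Passing a code unit offset to code that expects a byte offset (or the reverse) therefore points at the wrong character as soon as the text contains anything outside ASCII.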
parent dbbdf2cebc
commit 3df83dade8
Notes:
github-actions[bot]
2025-06-13 13:09:49 +00:00
Author: https://github.com/gmta
Commit: 3df83dade8
Pull-request: https://github.com/LadybirdBrowser/ladybird/pull/5067
Reviewed-by: https://github.com/tcl3
Reviewed-by: https://github.com/trflynn89
6 changed files with 110 additions and 36 deletions
@@ -0,0 +1,15 @@
<!DOCTYPE html>
<script src="include.js"></script>
😭foobar😭
<script>
    test(() => {
        internals.mouseDown(55, 20);
        internals.movePointerTo(110, 20);

        const activeRange = window.getSelection().getRangeAt(0);
        printElement(activeRange.startContainer);
        println(activeRange.startOffset);
        printElement(activeRange.endContainer);
        println(activeRange.endOffset);
    });
</script>