AK+Everywhere: Recognise that surrogates in utf16 aren't all that common

For the slight cost of counting code points when converting between
encodings and a teeny bit of memory, this commit adds a fast path for
all-happy utf-16 substrings and code point operations.

These operations seem to account for a significant chunk of the time
spent in many regex benchmarks.
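
A minimal sketch of the fast path described above, not the code from this commit. `Utf16Sketch` and its members are hypothetical names and are not AK's real types. The idea is that if the cached code point count equals the number of code units, every code point occupies exactly one code unit, so code point offsets can be used directly as code unit offsets.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical view type for illustration only; AK's real Utf16View differs.
struct Utf16Sketch {
    std::u16string_view units;
    std::size_t code_point_count; // assumed to be cached when the string was converted

    // True when every code point is exactly one code unit (no surrogate pairs).
    bool one_unit_per_code_point() const { return code_point_count == units.size(); }

    // Translate a code point offset into a code unit offset.
    std::size_t code_unit_offset_of(std::size_t code_point_offset) const
    {
        if (one_unit_per_code_point())
            return code_point_offset; // fast path: O(1), no scanning

        // Slow path: walk the string, stepping over surrogate pairs.
        std::size_t unit = 0;
        for (std::size_t cp = 0; cp < code_point_offset && unit < units.size(); ++cp) {
            bool is_high = units[unit] >= 0xD800 && units[unit] <= 0xDBFF;
            bool low_follows = unit + 1 < units.size()
                && units[unit + 1] >= 0xDC00 && units[unit + 1] <= 0xDFFF;
            unit += (is_high && low_follows) ? 2 : 1;
        }
        return unit;
    }
};
```

For a string such as `u"hello"` with a cached count of 5, the lookup returns immediately. Only strings that actually contain surrogate pairs pay for the linear walk.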
Ali Mohammad Pur 2025-04-02 17:56:49 +02:00 committed by Andrew Kaster
parent 86c756a589
commit eea81738cd
Notes: github-actions[bot] 2025-04-23 13:57:06 +00:00
11 changed files with 74 additions and 37 deletions

@@ -54,7 +54,7 @@ ByteString SVGTextContentElement::text_contents() const
 // https://svgwg.org/svg2-draft/text.html#__svg__SVGTextContentElement__getNumberOfChars
 WebIDL::ExceptionOr<WebIDL::Long> SVGTextContentElement::get_number_of_chars() const
 {
-    auto chars = TRY_OR_THROW_OOM(vm(), utf8_to_utf16(text_contents()));
+    auto chars = TRY_OR_THROW_OOM(vm(), utf8_to_utf16(text_contents())).data;
     return static_cast<WebIDL::Long>(chars.size());
 }
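
The added `.data` suggests that the conversion now returns a small struct instead of the bare code unit buffer. The sketch below shows one way that could look, under stated assumptions: `ConversionResultSketch` and `utf8_to_utf16_sketch` are hypothetical names, the input is assumed to be well-formed UTF-8, and no validation is performed. It is not the AK implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical result type: the code units plus the code point count
// gathered during conversion. The real AK type may differ.
struct ConversionResultSketch {
    std::u16string data;
    std::size_t code_point_count;
};

// Assumes well-formed UTF-8; malformed input is not detected.
ConversionResultSketch utf8_to_utf16_sketch(std::string_view utf8)
{
    ConversionResultSketch result { {}, 0 };
    std::size_t i = 0;
    while (i < utf8.size()) {
        unsigned char lead = static_cast<unsigned char>(utf8[i]);
        // Sequence length from the lead byte.
        std::size_t length = lead < 0x80 ? 1 : lead < 0xE0 ? 2 : lead < 0xF0 ? 3 : 4;
        // Payload bits of the lead byte (5, 4, or 3 bits for multi-byte sequences).
        char32_t cp = length == 1 ? lead : lead & (0xFF >> (length + 1));
        for (std::size_t k = 1; k < length && i + k < utf8.size(); ++k)
            cp = (cp << 6) | (static_cast<unsigned char>(utf8[i + k]) & 0x3F);
        i += length;

        if (cp >= 0x10000) {
            // Outside the BMP: emit a surrogate pair.
            cp -= 0x10000;
            result.data.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            result.data.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        } else {
            result.data.push_back(static_cast<char16_t>(cp));
        }
        ++result.code_point_count; // counted once per code point, whatever its width
    }
    return result;
}
```

A caller that only needs the code units, as `get_number_of_chars()` does, would take `.data` and ignore the count.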