LibWeb: Fix off-by-one error in HTML Tokenizer

In 'NamedCharacterReference' we attempt to look up the code point by an
identifier, e.g. apos; becomes '

This is done by passing the entire rest of the document to the
`HTML::code_points_from_entity` function.

However, before this change we didn't send the final character, which
meant that if the document ended in a named character reference the
lookup would fail.
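
As a rough illustration of the off-by-one, here is a minimal standalone sketch using `std::string_view` instead of the tokenizer's own string types. The names `lookup_entity`, `rest_before_fix` and `rest_after_fix` are hypothetical and exist only for this example; `lookup_entity` is a toy stand-in that knows a single entity, not the real `HTML::code_points_from_entity`.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical stand-in for the entity lookup: it recognises only "apos;"
// at the start of the remaining input and returns the apostrophe code point.
std::optional<char32_t> lookup_entity(std::string_view rest)
{
    constexpr std::string_view name = "apos;";
    if (rest.substr(0, name.size()) == name)
        return U'\'';
    return std::nullopt;
}

// Length computed as in the old code: one byte short, so the last
// character of the input is never passed to the lookup.
std::string_view rest_before_fix(std::string_view input, std::size_t byte_offset)
{
    return input.substr(byte_offset, input.length() - byte_offset - 1);
}

// Length computed as in the fixed code: everything from byte_offset to the
// end of the input, including the final character.
std::string_view rest_after_fix(std::string_view input, std::size_t byte_offset)
{
    return input.substr(byte_offset, input.length() - byte_offset);
}
```

For an input that ends in the reference, such as `a&apos;` with `byte_offset` 2, the old length yields `apos` (no trailing semicolon), so the toy lookup finds nothing, while the new length yields `apos;` and the lookup succeeds.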
Adam Hodgen 2022-02-18 22:12:47 +00:00 committed by Andreas Kling
parent c6dd8a1f66
commit c6fcdd0f93
Notes: sideshowbarker 2024-07-17 18:27:21 +09:00

@@ -1617,7 +1617,7 @@ _StartOfFunction:
 {
 size_t byte_offset = m_utf8_view.byte_offset_of(m_prev_utf8_iterator);
-auto match = HTML::code_points_from_entity(m_decoded_input.substring_view(byte_offset, m_decoded_input.length() - byte_offset - 1));
+auto match = HTML::code_points_from_entity(m_decoded_input.substring_view(byte_offset, m_decoded_input.length() - byte_offset));
 if (match.has_value()) {
 skip(match->entity.length() - 1);