mirror of
https://github.com/LadybirdBrowser/ladybird.git
synced 2025-04-28 23:39:02 +00:00
LibWeb: Fix numeric character reference at EOF leaking its last digit
Previously, if the NumericCharacterReferenceEnd state was reached when current_input_character was None, then the DONT_CONSUME_NEXT_INPUT_CHARACTER macro would restore back before the EOF, and allow the next state (after the SWITCH_TO_RETURN_STATE) to proceed with the last digit of the numeric character reference. For example, with something like `ї`, before this commit the output would incorrectly be `<code point with the value 1111>1` instead of just `<code point with the value 1111>`. Instead of putting the `if (current_input_character.has_value())` check inside NumericCharacterReferenceEnd directly, it was instead added to DONT_CONSUME_NEXT_INPUT_CHARACTER, because all usages of the macro benefit from this check, even if the other existing usage sites don't exhibit any bugs without it: - In MarkupDeclarationOpen, if the current_input_character is EOF, then the previous character is always `!`, so restoring and then checking forward for strings like `--`, `DOCTYPE`, etc won't match and the BogusComment state will run one extra time (once for `!` and once for EOF) with no practical consequences. With the `has_value()` check, BogusComment will only run once with EOF. - In AfterDOCTYPEName, ConsumeNextResult::RanOutOfCharacters can only occur when stopping at the insertion point, and because of how the code is structured, it is guaranteed that current_input_character is either `P` or `S`, so the `has_value()` check is irrelevant.
This commit is contained in:
parent
752deaf6ef
commit
df87a9689c
Notes:
github-actions[bot]
2025-01-06 23:44:49 +00:00
Author: https://github.com/squeek502
Commit: df87a9689c
Pull-request: https://github.com/LadybirdBrowser/ladybird/pull/3163
Reviewed-by: https://github.com/gmta ✅
3 changed files with 15 additions and 6 deletions
|
@ -94,9 +94,10 @@ namespace Web::HTML {
|
|||
} \
|
||||
} while (0)
|
||||
|
||||
#define DONT_CONSUME_NEXT_INPUT_CHARACTER \
|
||||
do { \
|
||||
restore_to(m_prev_utf8_iterator); \
|
||||
#define DONT_CONSUME_NEXT_INPUT_CHARACTER \
|
||||
do { \
|
||||
if (current_input_character.has_value()) \
|
||||
restore_to(m_prev_utf8_iterator); \
|
||||
} while (0)
|
||||
|
||||
#define ON(code_point) \
|
||||
|
|
Loading…
Add table
Add a link
Reference in a new issue