LibWeb/HTML: Bail from HTML parsing when EOF hit on document.close

This fixes a crash in the included test that regressed in 0adf261,
and is hit by the following HTML:

```html
<body></body>
<script>
  const frame = document.body.appendChild(document.createElement("iframe"));
  frame.contentDocument.open();
  const child = frame.contentDocument.createElement("html");
  const html = frame.contentDocument.appendChild(child);
  frame.contentDocument.close();
</script>
```

I am not 100% sure this is the fully correct fix, and there may be
other cases which still do not work properly. But it is definitely an
improvement to make the confusingly named 'insert_an_eof' function of
the tokenizer actually do something.
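
To illustrate the idea, here is a minimal, self-contained sketch of the
early bail-out. It is not the LibWeb code: ToyTokenizer, Token,
insert_eof and run_parser are hypothetical stand-ins, and the only part
taken from this change is the check that an end-of-file token combined
with an explicitly inserted EOF stops the loop before the token is
dispatched.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical, simplified stand-ins; these are NOT the LibWeb classes.
enum class TokenType { Character, EndOfFile };

struct Token {
    TokenType type;
    char data;
    bool is_end_of_file() const { return type == TokenType::EndOfFile; }
};

class ToyTokenizer {
public:
    explicit ToyTokenizer(std::string input) : m_input(std::move(input)) {}

    // Assumption: models an explicit EOF request such as document.close().
    void insert_eof() { m_eof_inserted = true; }
    bool is_eof_inserted() const { return m_eof_inserted; }

    // Emits one character token per input byte, then end-of-file tokens.
    Token next_token() {
        if (m_position < m_input.size())
            return Token { TokenType::Character, m_input[m_position++] };
        return Token { TokenType::EndOfFile, '\0' };
    }

private:
    std::string m_input;
    std::size_t m_position = 0;
    bool m_eof_inserted = false;
};

// Returns how many tokens reached the (imaginary) tree construction stage.
std::size_t run_parser(ToyTokenizer& tokenizer) {
    std::size_t dispatched = 0;
    for (;;) {
        Token token = tokenizer.next_token();

        // The bail-out: an EOF token after an explicit EOF was inserted
        // is never handed to tree construction.
        if (token.is_end_of_file() && tokenizer.is_eof_inserted())
            break;

        ++dispatched; // Stand-in for the tree construction dispatcher.

        // A regular EOF is still dispatched once, then parsing ends.
        if (token.is_end_of_file())
            break;
    }
    return dispatched;
}
```

In this sketch a tokenizer without an inserted EOF still dispatches its
final end-of-file token, while one with an inserted EOF stops right
after the last character token.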
Shannon Booth 2025-02-09 12:47:50 +13:00 committed by Tim Ledbetter
parent 9e556972ae
commit 7441aa34e4
Notes: github-actions[bot] 2025-02-09 19:21:10 +00:00
4 changed files with 95 additions and 0 deletions

@@ -199,6 +199,9 @@ void HTMLParser::run(HTMLTokenizer::StopAtInsertionPoint stop_at_insertion_point
dbgln_if(HTML_PARSER_DEBUG, "[{}] {}", insertion_mode_name(), token.to_string());
if (token.is_end_of_file() && m_tokenizer.is_eof_inserted())
break;
// https://html.spec.whatwg.org/multipage/parsing.html#tree-construction-dispatcher
// As each token is emitted from the tokenizer, the user agent must follow the appropriate steps from the following list, known as the tree construction dispatcher:
if (m_stack_of_open_elements.is_empty()