Commit graph

3 commits

Author SHA1 Message Date
Timothy Flynn
00182a2405 LibJS: Port the JS lexer and parser to UTF-16
This ports the lexer to UTF-16 and deals with the immediate fallout up
to the AST. The AST will be dealt with in upcoming commits.

The lexer will still accept UTF-8 strings as input, and will transcode
them to UTF-16 for lexing. This doesn't actually incur a new allocation,
as we were already converting the input StringView to a ByteString for
each lexer.

One immediate logical benefit here is that we do not need to know off-
hand how many UTF-8 bytes some special code points occupy. They all
happen to be a single UTF-16 code unit. So instead of advancing the
lexer by 3 positions in some cases, we can just always advance by 1.
2025-08-13 09:56:13 -04:00
Sam Atkins
9b7fb0850d LibJS+LibWebView: Treat trivia tokens as comments
Trivia is whatever whitespace and comments appear before a token.
Previously this was always given a TokenCategory of Invalid, so it
would be displayed as an error in the view-source page, with red wiggly
underlines. Instead, treat it as what it actually is: whitespace and
comments!
2025-03-04 15:54:03 -05:00
Timothy Flynn
93712b24bf Everywhere: Hoist the Libraries folder to the top-level 2024-11-10 12:50:45 +01:00
Renamed from Userland/Libraries/LibJS/SyntaxHighlighter.cpp (Browse further)