Commit graph

17 commits

Author SHA1 Message Date
Sergey Bugaev
1274c244d5 LibJS: Fix out-of-bounds read when parsing escape sequences
We cannot look at i+1'th character until we verify it's there.
2020-06-01 17:37:44 +02:00
Matthew Olsson
e415dd4e9c LibJS: Handle hex and unicode escape sequences in string literals
Introduces the following syntax:

'\x55'
'\u26a0'
'\u{1f41e}'
2020-05-18 17:58:17 +02:00
mattco98
adb4accab3 LibJS: Add template literals
Adds fully functioning template literals. Because template literals
contain expressions, most of the work has to be done in the Lexer rather
than the Parser. And because of the complexity of template literals
(expressions, nesting, escapes, etc), the Lexer needs to have some
template-related state.

When entering a new template literal, a TemplateLiteralStart token is
emitted. When inside a literal, all text will be parsed up until a '${'
or '`' (or EOF, but that's a syntax error) is seen, and then a
TemplateLiteralExprStart token is emitted. At this point, the Lexer
proceeds as normal, however it keeps track of the number of opening
and closing curly braces it has seen in order to determine the close
of the expression. Once it finds a matching curly brace for the '${',
a TemplateLiteralExprEnd token is emitted and the state is updated
accordingly.

When the Lexer is inside of a template literal, but not an expression,
and sees a '`', this must be the closing grave: a TemplateLiteralEnd
token is emitted.

The state required to correctly parse template strings consists of a
vector (for nesting) of two pieces of information: whether or not we
are in a template expression (as opposed to a template string); and
the count of the number of unmatched open curly braces we have seen
(only applicable if the Lexer is currently in a template expression).

TODO: Add support for template literal newlines in the JS REPL (this will
cause a syntax error currently):

    > `foo
    > bar`
    'foo
    bar'
2020-05-04 16:46:31 +02:00
Linus Groh
95b51e857d LibJS: Add TokenType::TemplateLiteral
This is required for template literals - we're not quite there yet, but at
least the parser can now tell us when this token is encountered -
currently this yields "Unexpected token Invalid". Not really helpful.

The character is a "backtick", but as we already have
TokenType::{StringLiteral,RegexLiteral} this seemed like a fitting name.

This also enables syntax highlighting for template literals in the js
REPL and LibGUI's JSSyntaxHighlighter.
2020-04-24 11:18:57 +02:00
Stephan Unverwerth
bf5b251684 LibJS: Allow reserved words as keys in object expressions. 2020-04-18 22:23:20 +02:00
Stephan Unverwerth
500f6d9e3a LibJS: Add numeric literal parsing for different bases and exponents 2020-04-05 16:01:22 +02:00
Andreas Kling
a860a3f793 LibJS: Hack the lexer to allow numbers with decimals
This is very hackish and should definitely be improved. :^)
2020-04-04 23:13:48 +02:00
Andreas Kling
a1c718e387 LibJS: Use some macro magic to avoid duplicating all the token types 2020-03-30 13:11:07 +02:00
Andreas Kling
1923051c5b LibJS: Lexer and parser support for "switch" statements 2020-03-29 15:03:58 +02:00
0xtechnobabble
bc002f807a LibJS: Parse object expressions 2020-03-21 10:08:58 +01:00
0xtechnobabble
cfd710eb31 LibJS: Implement null and undefined literals 2020-03-16 13:42:13 +01:00
Stephan Unverwerth
3389021291 LibJS: Unescape strings in Token::string_value() 2020-03-14 16:00:28 +01:00
Andreas Kling
f94099f796 LibJS: Strip double-quote characters from StringLiteral tokens
This is very hackish since I'm just doing it to make progress on
something else. :^)
2020-03-14 12:40:06 +01:00
Stephan Unverwerth
c0e6234219 LibJS: Lex single quote strings, escaped chars and unterminated strings 2020-03-14 12:13:53 +01:00
Oriko
e273203d27 LibJS: Add missing tokens to name() 2020-03-14 11:30:31 +01:00
Stephan Unverwerth
15d5b2d29e LibJS: Add operator precedence parsing
Obey precedence and associativity rules when parsing expressions
with chained operators.
2020-03-14 00:11:24 +01:00
Stephan Unverwerth
f3a9eba987 LibJS: Add Javascript lexer and parser
This adds a basic Javascript lexer and parser. It can parse the
currently existing demo programs. More work needs to be done to
turn it into a complete parser than can parse arbitrary JS Code.

The lexer outputs tokens with preceeding whitespace and comments
in the trivia member. This should allow us to generate the exact
source code by concatenating the generated tokens.

The parser is written in a way that it always returns a complete
syntax tree. Error conditions are represented as nodes in the
tree. This simplifies the code and allows it to be used as an
early stage parser, e.g for parsing JS documents in an IDE while
editing the source code.:
2020-03-12 09:25:49 +01:00