Commit graph

106 commits

Author SHA1 Message Date
Timothy Flynn
47f6bb38a1 LibRegex: Support UTF-16 RegexStringView and improve Unicode matching
When the Unicode option is not set, regular expressions should match
based on code units; when it is set, they should match based on code
points. To do so, the regex parser must combine surrogate pairs when
the Unicode option is set. Further, RegexStringView needs to know if
the flag is set in order to return code point vs. code unit based
string lengths and substrings.
2021-07-23 23:06:57 +01:00
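The distinction drawn in the commit message above (code units without the Unicode option, code points with it) can be sketched as follows. This is a minimal illustration, not LibRegex code: `decode_utf16` and `match_length` are hypothetical names, and `std::u16string` stands in for whatever UTF-16 view the library actually uses.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not the LibRegex API): decode UTF-16 code units into
// code points, combining each valid high/low surrogate pair into one code point.
std::vector<char32_t> decode_utf16(const std::u16string& units)
{
    std::vector<char32_t> code_points;
    for (std::size_t i = 0; i < units.size(); ++i) {
        char32_t unit = units[i];
        bool is_high_surrogate = unit >= 0xD800 && unit <= 0xDBFF;
        if (is_high_surrogate && i + 1 < units.size()) {
            char32_t next = units[i + 1];
            if (next >= 0xDC00 && next <= 0xDFFF) {
                // Standard UTF-16 pair decoding.
                code_points.push_back(0x10000 + ((unit - 0xD800) << 10) + (next - 0xDC00));
                ++i; // The low surrogate has been consumed as well.
                continue;
            }
        }
        // BMP code units and lone surrogates pass through unchanged.
        code_points.push_back(unit);
    }
    return code_points;
}

// Hypothetical helper: the length a matcher would see. With the Unicode flag
// set it counts code points; otherwise it counts raw code units.
std::size_t match_length(const std::u16string& units, bool unicode)
{
    return unicode ? decode_utf16(units).size() : units.size();
}

// Usage: 'a' followed by U+1F600, which UTF-16 encodes as 0xD83D 0xDE00.
void example()
{
    std::u16string text = { u'a', char16_t(0xD83D), char16_t(0xDE00) };
    assert(match_length(text, false) == 3); // three code units
    assert(match_length(text, true) == 2);  // two code points
}
```

Passing lone surrogates through unchanged is an assumption of this sketch; the commit message does not say how LibRegex treats them.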
Ali Mohammad Pur
f364fcec5d LibRegex+Everywhere: Make LibRegex more unicode-aware
This commit makes LibRegex (mostly) capable of operating on any of
the three main string views:
- StringView for raw strings
- Utf8View for utf-8 encoded strings
- Utf32View for raw unicode strings

As a result, regexps with unicode strings should be able to properly
handle utf-8 and not stop in the middle of a code point.
A future commit will update LibJS to use the correct type of string
depending on the flags.
2021-07-18 21:10:55 +04:30
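To illustrate what "not stop in the middle of a code point" means for UTF-8 input, here is a minimal sketch that advances a byte offset to the start of the next code point. The function names are hypothetical and a `std::string` of bytes is assumed as input; the actual commit works through AK's `Utf8View` instead.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: UTF-8 continuation bytes have the bit pattern 10xxxxxx.
bool is_continuation_byte(unsigned char byte)
{
    return (byte & 0xC0) == 0x80;
}

// Hypothetical helper: given a byte offset at the start of a code point,
// return the byte offset at which the next code point starts (or the string
// size at the end), so a matcher never lands inside a multi-byte sequence.
std::size_t next_code_point_offset(const std::string& utf8, std::size_t offset)
{
    if (offset >= utf8.size())
        return utf8.size();
    ++offset; // Step past the leading byte.
    while (offset < utf8.size() && is_continuation_byte(static_cast<unsigned char>(utf8[offset])))
        ++offset; // Skip the remaining bytes of the same code point.
    return offset;
}
```

For the bytes of "aéb" (`61 C3 A9 62`), stepping from offset 1 yields 3, skipping both bytes of "é" in one move. The sketch assumes well-formed UTF-8 and does no validation.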
Ali Mohammad Pur
e5af15a6e9 LibRegex: Don't do out-of-bound match accesses when a test fails 2021-07-18 21:10:55 +04:30
Timothy Flynn
65003241e4 LibRegex: Allow dollar signs in ECMA262 named capture groups
Fixes 1 test262 test.
2021-07-06 22:33:17 +01:00
sin-ack
9a9e7f03f2 Tests: Add test for case-insensitive matching 2021-06-16 16:30:12 +04:30
Brian Gianforcaro
6e918e4e02 Tests: Move LibRegex tests to Tests/LibRegex 2021-05-06 17:54:28 +02:00
Renamed from Userland/Libraries/LibRegex/Tests/Regex.cpp