Ali Mohammad Pur
3b4a184f1a
LibRegex: Avoid hashing the state hashes again
...
We already had a really nice hash that had a single issue, this commit
fixes that and makes it *the* hash for the hash table, so we avoid
double-hashing and making a long chain.
This is an easy 10% perf gain.
2025-04-18 17:09:27 +02:00
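The double-hashing problem described in "LibRegex: Avoid hashing the state hashes again" can be illustrated with a small sketch. This is not LibRegex's code: `State`, `hash_state`, `IdentityHash` and `mark_seen` are hypothetical names, the mixing function is an arbitrary FNV-style one, and `std::unordered_set` stands in for AK's hash table. The idea shown is only that a value which is already a good hash can be handed to the table as-is instead of being hashed a second time.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_set>

// Hypothetical stand-in for a matcher state; not LibRegex's MatchState.
struct State {
    std::size_t instruction_position;
    std::size_t string_position;
};

// Hypothetical FNV-style mix of both fields into one 64-bit value.
inline std::uint64_t hash_state(State const& state)
{
    std::uint64_t hash = 0xcbf29ce484222325ull;
    hash = (hash ^ state.instruction_position) * 0x100000001b3ull;
    hash = (hash ^ state.string_position) * 0x100000001b3ull;
    return hash;
}

// The table's hasher returns the precomputed value unchanged,
// so the state hash is not hashed again on every lookup.
struct IdentityHash {
    std::size_t operator()(std::uint64_t precomputed) const noexcept
    {
        return static_cast<std::size_t>(precomputed);
    }
};

// Assumption for brevity: only the hash values are stored, not the states.
using SeenStates = std::unordered_set<std::uint64_t, IdentityHash>;

// Returns true if the state had not been recorded before.
inline bool mark_seen(SeenStates& seen, State const& state)
{
    return seen.insert(hash_state(state)).second;
}
```

Calling `mark_seen` twice with the same state returns true the first time and false the second. Storing only hashes treats colliding states as equal, which is a simplification of this sketch and not a claim about the commit.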
Ali Mohammad Pur
446a453719
LibRegex: Pull out the first compare to avoid unnecessary execution
...
This adds a fast-path to drop view indices we know will not match
immediately without going through the regex VM.
2025-04-18 17:09:27 +02:00
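The fast path from "LibRegex: Pull out the first compare to avoid unnecessary execution" can be sketched roughly as follows, assuming the simplest possible case where the pattern is known to start with a single literal character. `candidate_starts` is a hypothetical helper, not a LibRegex function; the real first compare may be a character class or something more general.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns the start offsets worth handing to the full
// matcher, assuming the pattern must begin with the literal `first`.
// Every other offset is dropped without running the regex VM at all.
inline std::vector<std::size_t> candidate_starts(std::string const& input, char first)
{
    std::vector<std::size_t> starts;
    for (std::size_t i = 0; i < input.size(); ++i) {
        if (input[i] == first)
            starts.push_back(i);
    }
    return starts;
}
```

For the input `"xxabxa"` and first character `'a'`, only offsets 2 and 5 remain as candidates, so the matcher would be entered twice instead of six times.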
Ali Mohammad Pur
76f5dce3db
LibRegex: Flatten capture group list in MatchState
...
This makes copying the capture group COWVector significantly cheaper,
as we no longer have to run any constructors for it - just memcpy.
2025-04-18 17:09:27 +02:00
Andreas Kling
ca2f0141f6
LibRegex: Remove unused "simple substring search" optimization
...
This is not relevant for LibJS since it only works when the input is
UTF-8, and LibJS always provides UTF-16.
2025-04-16 10:04:50 +02:00
Andreas Kling
96f1f15ad6
LibRegex: Remove unused Utf8View/Utf32View support in RegexStringView
2025-04-16 10:04:50 +02:00
Andreas Kling
87ec5b32b0
LibRegex: Use ReadonlySpan to peek into OpCode_Compare LUTs
...
By the time we're executing bytecode, we know the bytecode will be
flattened. This means we can use ReadonlySpan to look into it instead of
DisjointChunks::spans(), which allocates.
2025-04-14 17:40:13 +02:00
Andreas Kling
c1c3b01a6c
LibRegex: Allow Vector<Match> to use trivial memcpy
...
Now that Match has no more members that need destruction, we can allow
Vector to memcpy them around.
2025-04-14 17:40:13 +02:00
Andreas Kling
5308d77600
LibRegex: Don't use Optional<T> inside regex::Match
...
This prevented Match from being trivially copyable, which we want it to
be for fast Vector copying.
2025-04-14 17:40:13 +02:00
Andreas Kling
54edf29f1b
LibRegex: Make Match::capture_group_name an index into the string table
...
This removes another Match member that required destruction. The "API"
for accessing the strings is definitely a bit awkward. We'll think of
something nicer eventually.
2025-04-14 17:40:13 +02:00
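The series of Match commits around this point (string-table index, no Optional, trivial memcpy) shares one idea, which the sketch below tries to show with standard C++ types. `MatchWithString` and `FlatMatch` are hypothetical structs, not the real `regex::Match`, and `std::string` stands in for any member that needs destruction. AK's containers are not used here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

// Hypothetical "before" shape: owning a string forces a non-trivial
// copy constructor and destructor on the whole struct.
struct MatchWithString {
    std::size_t start;
    std::size_t length;
    std::string capture_group_name;
};

// Hypothetical "after" shape: the name lives in a separate string table
// and the match only stores an index, with a sentinel for "no name".
struct FlatMatch {
    static constexpr std::size_t no_name = static_cast<std::size_t>(-1);

    std::size_t start;
    std::size_t length;
    std::size_t capture_group_name_index;

    bool has_name() const { return capture_group_name_index != no_name; }
};

static_assert(!std::is_trivially_copyable<MatchWithString>::value,
    "a string member prevents memcpy-style copies");
static_assert(std::is_trivially_copyable<FlatMatch>::value,
    "plain indices allow memcpy-style copies");
```

A container that checks a trait like this can copy a whole buffer of `FlatMatch` values with a single `memcpy` instead of invoking a constructor per element. The sentinel replaces an optional wrapper, which is one possible way to keep the struct trivial.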
Andreas Kling
9d47cc54f8
LibRegex: Remove unused regex::Match::string and unused constructor
...
This shrinks regex::Match by 8 bytes and removes a member that needs
destruction.
2025-04-14 17:40:13 +02:00
Ali Mohammad Pur
69050da929
LibRegex: Merge inverse string table mappings separately
2025-04-06 20:21:16 +02:00
Ali Mohammad Pur
299b9ca572
LibRegex: Check backreference index before looking it up
...
If a backref happens after it's cleared, the slot may be cleared
already.
2025-04-06 20:21:16 +02:00
Jess
83e46b3728
LibRegex: Fix crash when parse result exceeds max cache size
...
Before, if the cache was empty, we would try to evict non-existent
entries and crash. So the fix is to make sure that we don't saturate
the cache with a single parse result.
2025-04-04 16:10:25 +02:00
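The failure mode in "LibRegex: Fix crash when parse result exceeds max cache size" can be sketched with a minimal size-bounded cache. `ParseCache` is a hypothetical class with FIFO eviction, chosen only for brevity; the real cache's eviction policy and size accounting are not shown in this log. The point is the early return: an entry larger than the whole budget is never inserted, so the eviction loop can never run on an empty cache.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <utility>

// Hypothetical size-bounded cache with FIFO eviction.
class ParseCache {
public:
    explicit ParseCache(std::size_t max_total_size)
        : m_max(max_total_size)
    {
    }

    // Returns false when the entry alone exceeds the budget; such an entry
    // is skipped instead of evicting everything and then evicting "nothing".
    bool insert(std::string pattern, std::size_t entry_size)
    {
        if (entry_size > m_max)
            return false;
        while (m_total + entry_size > m_max) {
            // Holds because entry_size <= m_max implies m_total > 0 here.
            assert(!m_entries.empty());
            m_total -= m_entries.front().second;
            m_entries.pop_front();
        }
        m_total += entry_size;
        m_entries.emplace_back(std::move(pattern), entry_size);
        return true;
    }

    std::size_t entry_count() const { return m_entries.size(); }
    std::size_t total_size() const { return m_total; }

private:
    std::size_t m_max;
    std::size_t m_total = 0;
    std::deque<std::pair<std::string, std::size_t>> m_entries;
};
```

With a budget of 10, inserting an entry of size 11 into an empty cache is rejected and leaves the cache empty, while two entries of size 6 result in the first being evicted to make room for the second.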
Ali Mohammad Pur
4136d8d13e
LibRegex: Use an interned string table for capture group names
...
This avoids messing around with unsafe string pointers and removes the
only non-FlyString-able user of DeprecatedFlyString.
2025-04-02 11:43:13 +02:00
Andreas Kling
e5db913b0d
Revert "LibRegex: Port remaining DeprecatedFlyString to ByteString"
...
This reverts commit aab3fbe254.
Greatly regressed JavaScript benchmark performance.
2025-04-01 15:40:38 +02:00
Andreas Kling
7c32d1e8a5
Revert "Everywhere: Remove DeprecatedFlyString + any remaining references to it"
...
This reverts commit 3131e6369f.
Greatly regressed JavaScript benchmark performance.
2025-04-01 15:40:27 +02:00
Kenneth Myhra
3131e6369f
Everywhere: Remove DeprecatedFlyString + any remaining references to it
2025-04-01 12:50:00 +02:00
Kenneth Myhra
aab3fbe254
LibRegex: Port remaining DeprecatedFlyString to ByteString
2025-04-01 12:50:00 +02:00
Andreas Kling
6b6d3b32a4
LibRegex: Remove the StringCopyMatches mode
...
This mode made a lot of incorrect assumptions about string lifetimes,
and instead of fixing it, let's just remove it and tweak the few unit
tests that used it.
2025-03-24 22:27:17 +00:00
Andreas Kling
46a5710238
LibJS: Use FlyString in PropertyKey instead of DeprecatedFlyString
...
This required dealing with *substantial* fallout.
2025-03-24 22:27:17 +00:00
mikiubo
c85df78c4c
LibRegex: Remove orphaned save points in nested LookAhead
2025-03-17 16:11:02 +01:00
Tim Ledbetter
b9ac99d2eb
Revert "LibRegex: Remove orphaned save points in nested LookAhead"
...
This reverts commit f2678bfcb8.
2025-03-14 19:57:33 +00:00
mikiubo
f2678bfcb8
LibRegex: Remove orphaned save points in nested LookAhead
2025-03-14 09:41:41 +01:00
Ali Mohammad Pur
5355710481
LibRegex: Don't treat single-jump blocks as noop in the optimizer
2025-03-09 14:37:57 +01:00
aplefull
389a63d6bf
LibRegex: Allow duplicate named capture groups in separate alternatives
2025-03-05 14:36:09 +01:00
aplefull
61744322ad
LibRegex: Ensure nullable quantifiers backtrack when input remains
...
Makes patterns like `/(a?b??)*/` correctly match the string
2025-03-02 15:19:04 +01:00
Ali Mohammad Pur
ea3b7efd91
LibRegex: Treat the UnicodeSets flag as Unicode
...
Fixes /.../v not being interpreted as a unicode pattern.
2025-02-28 14:31:45 -05:00
mikiubo
8a6f7b787e
LibRegex: Use depth-first search in regex optimizer
...
Use depth-first search in the optimizer code because using breadth-first
search generates a bug. Add a test example in the test lib.
2025-02-25 00:09:20 +01:00
Ali Mohammad Pur
08ebfaff17
LibRegex: Take trailing inversion state into account in block comparison
...
Fixes #3421.
2025-02-01 11:30:02 +01:00
Timothy Flynn
85b424464a
AK+Everywhere: Rename verify_cast to as
...
Follow-up to fc20e61e72.
2025-01-21 11:34:06 -05:00
Ali Mohammad Pur
cce000d57c
LibRegex: Don't repeat the same fork again
...
If some state has already been tried, skip over it as it would never
lead to a match regardless.
This fixes performance/memory issues in cases like
/(a+)+b/.exec("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa")
or
/(a|a?)+b/...
Fixes #2622.
2025-01-17 10:13:51 +01:00
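The effect described in "LibRegex: Don't repeat the same fork again" can be reproduced with a hand-written backtracking matcher for the single pattern `(a+)+b`. This is only a sketch under strong assumptions: `Matcher`, `match_iteration` and `count_steps` are hypothetical, the pattern is hard-coded, and a state is identified by input position alone, whereas a real regex VM would have to key on more of its state.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>

// Hypothetical backtracking matcher for the fixed pattern (a+)+b.
struct Matcher {
    std::string input;
    bool memoize;
    std::unordered_set<std::size_t> failed_positions;
    std::size_t steps = 0;

    // Tries to match one or more group iterations followed by 'b',
    // with a fresh iteration of (a+) starting at `pos`.
    bool match_iteration(std::size_t pos)
    {
        // A position that already failed can never lead to a match.
        if (memoize && failed_positions.count(pos))
            return false;
        ++steps;
        std::size_t end = pos;
        while (end < input.size() && input[end] == 'a')
            ++end;
        // Greedy: try the longest run of 'a' first, then backtrack.
        for (std::size_t stop = end; stop > pos; --stop) {
            if (stop < input.size() && input[stop] == 'b')
                return true;
            if (match_iteration(stop))
                return true;
        }
        if (memoize)
            failed_positions.insert(pos);
        return false;
    }
};

// Runs the matcher from offset 0 and reports how many states were explored.
inline std::size_t count_steps(std::string const& input, bool memoize, bool& matched)
{
    Matcher matcher { input, memoize, {}, 0 };
    matched = matcher.match_iteration(0);
    return matcher.steps;
}
```

On an input of n `a` characters with no `b`, the unmemoized version explores 2^n states because every suffix is retried once per way of splitting the prefix, while the memoized version explores each position once, for n + 1 states in total.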
Ali Mohammad Pur
7ceeb85ba7
LibRegex: Avoid use-after-move of trivial object
...
This is not an actual problem as the object is just an enum, but CLion
was bugging me.
2025-01-17 10:13:51 +01:00
Ali Mohammad Pur
50733c564c
LibRegex: Use the *actually* correct repeat start offset for Repeat
...
Fixes #2931 and various frequent crashes.
2024-12-23 13:13:52 +01:00
Pavel Shliak
811d5a5c3e
LibRegex: Remove duplicated condition
2024-12-22 12:33:41 +01:00
Pavel Shliak
7dd7f77219
LibRegex: Remove duplicated assignments
2024-12-22 12:33:41 +01:00
Ali Mohammad Pur
eee90f4aa2
LibRegex: Treat checks against nonexistent checkpoints as empty
...
Due to optimiser shenanigans in the tree alternative form, some
JumpNonEmpty ops might be moved before their Checkpoint instruction.
It is safe to assume the distance between the nonexistent checkpoint and
the current op is zero, so just do that.
2024-12-13 10:00:16 +01:00
Ali Mohammad Pur
358378c1c0
LibRegex: Pick the right target for OpCode_Repeat
...
Repeat's 'offset' field is a bit odd in that it is treated as a negative
offset, causing a backwards jump when positive; the optimizer didn't
correctly model this behaviour, which caused crashes and misopts when
dealing with Repeats.
This commit fixes that behaviour.
2024-12-13 10:00:16 +01:00
Ali Mohammad Pur
4a8d3e35a3
LibRegex: Add some more debugging info to bytecode block ranges
...
These were getting difficult to differentiate, now they each get a
comment on where they came from to aid with future debugging.
2024-12-13 10:00:16 +01:00
Ali Mohammad Pur
f8092455e2
LibRegex: Print OpCode_Repeat's offset as ssize_t
2024-12-13 10:00:16 +01:00
Pavel Shliak
6f81b80114
Everywhere: Include HashMap only where it's actually used
2024-12-09 12:31:16 +01:00
Marc Jessome
efcaf991e6
LibRegex: Ensure nested capture groups have non-conflicting names
...
Take record of the named capture group prior to parsing the group's
body. This requires removal of the recorded minimum length of the named
capture group directly; it now needs to be looked up via the group
minimum lengths table.
2024-11-24 10:26:09 +01:00
Pavel Shliak
cdb54fe504
LibRegex: Clean up #include directives
...
This change aims to improve the speed of incremental builds.
2024-11-21 14:08:33 +01:00
Ali Mohammad Pur
5a4d657a4e
LibRegex: Avoid generating ForkJumps when jumping to the next alt block
...
Fixes #2398.
2024-11-17 20:12:39 +01:00
Ali Mohammad Pur
00bc22c332
LibRegex: Don't immediately ignore TempInverse in optimizer
...
fe46b2c141 added the reset-temp-inverse flag, but set it up so all
tempinverse ops were negated at the start of the next op; this commit
makes it so these flags actually persist for one op and not zero.
Fixes #2296.
2024-11-17 09:03:29 -05:00
Ali Mohammad Pur
dabd60180f
LibRegex: Don't ignore references that weren't bound in checked blocks
...
Fixes #2281.
2024-11-12 10:37:57 +01:00
Timothy Flynn
93712b24bf
Everywhere: Hoist the Libraries folder to the top-level
2024-11-10 12:50:45 +01:00
Andreas Kling
13d7c09125
Libraries: Move to Userland/Libraries/
2021-01-12 12:17:46 +01:00
Lenny Maiorani
e6f907a155
AK: Simplify constructors and conversions from nullptr_t
...
Problem:
- Many constructors are defined as `{}` rather than using the ` =
default` compiler-provided constructor.
- Some types provide an implicit conversion operator from `nullptr_t`
instead of requiring the caller to default construct. This violates
the C++ Core Guidelines suggestion to declare single-argument
constructors explicit
(https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#c46-by-default-declare-single-argument-constructors-explicit).
Solution:
- Change default constructors to use the compiler-provided default
constructor.
- Remove implicit conversion operators from `nullptr_t` and change
usage to enforce type consistency without conversion.
2021-01-12 09:11:45 +01:00
asynts
6fa42af567
Everywhere: Replace a bundle of dbg with dbgln.
...
These changes are arbitrarily divided into multiple commits to make it
easier to find potentially introduced bugs with git bisect.
The modifications in this commit were automatically made using the
following command:
find . -name '*.h' -exec sed -i -E 's/dbg\(\) << ("[^"{]*");/dbgln\(\1\);/' {} \;
2021-01-11 21:49:29 +01:00
Sahan Fernando
fe2b8906d4
Everywhere: Fix incorrect uses of String::format and StringBuilder::appendf
...
These changes are arbitrarily divided into multiple commits to make it
easier to find potentially introduced bugs with git bisect.
2021-01-11 21:06:32 +01:00