Commit graph

37 commits

Author SHA1 Message Date
Timothy Flynn
a43cb15e81 LibJS+LibWeb: Replace JS::Utf16String with AK::Utf16String 2025-07-18 12:45:38 -04:00
Timothy Flynn
2803d66d87 AK: Support UTF-16 string formatting
The underlying storage used during string formatting is StringBuilder.
To support UTF-16 strings, this patch allows callers to specify a mode
during StringBuilder construction. The default mode is UTF-8, for which
StringBuilder remains unchanged.

In UTF-16 mode, we treat the StringBuilder's internal ByteBuffer as a
series of u16 code units. Appending a single character will append 2
bytes for that character (cast to a char16_t). Appending a StringView
will transcode the string to UTF-16.

Utf16String also gains the same memory optimization that we added for
String, where we hand-off the underlying buffer to Utf16String to avoid
having to re-allocate.

In the future, we may want to further optimize for ASCII strings. For
example, we could defer committing to the u16-esque storage until we
see a non-ASCII code point.
2025-07-18 12:45:38 -04:00
Timothy Flynn
fe676585f5 AK: Add a UTF-16 string with optimized short- and ASCII-string storage
This is a strictly UTF-16 string with some optimizations for ASCII.

* If created from a short UTF-8 or UTF-16 string that is also ASCII,
  then the string is stored in an inlined byte buffer.

* If created with a long UTF-8 or UTF-16 string that is also ASCII,
  then the string is stored in an outlined char buffer.

* If created with a short or long UTF-8 or UTF-16 string that is not
  ASCII, then the string is stored in an outlined char16 buffer.

We do not store short non-ASCII text in the inlined buffer to avoid
confusion with operations such as `length_in_code_units` and
`code_unit_at`. For example, "😀" would be stored as 4 UTF-8 bytes
in short string form. But we still want `length_in_code_units` to
be 2, and `code_unit_at(0)` to be 0xD83D.
2025-07-18 12:45:38 -04:00
Luke Wilde
3d43462ccd LibJS: Implement the Dynamic Code Brand Checks stage 3 proposal
This is an active proposal at stage 3 of the TC39 proposal process.
See: https://tc39.es/proposal-dynamic-code-brand-checks/
See: https://github.com/tc39/proposal-dynamic-code-brand-checks

This proposal essentially adds support for the TrustedScript type from
the Trusted Types specification to eval and Function. This in turn
pipes support for the type into the CSP hook to check if the CSP allows
dynamic code compilation.

However, it currently doesn't support ShadowRealms, so the
implementation here is a close approximation, using PerformEval as the
basis.
See: https://github.com/tc39/proposal-dynamic-code-brand-checks/issues/19

This is required to support the new function signature for the CSP
hook, and will allow us to slot in Trusted Types support in the future.
2025-07-09 15:52:54 -06:00
Timothy Flynn
62d9a84b8d AK+Everywhere: Replace custom number parsers with fast_float
Some checks failed
CI / macOS, arm64, Sanitizer_CI, Clang (push) Waiting to run
CI / Linux, x86_64, Fuzzers_CI, Clang (push) Waiting to run
CI / Linux, x86_64, Sanitizer_CI, GNU (push) Waiting to run
CI / Linux, x86_64, Sanitizer_CI, Clang (push) Waiting to run
Package the js repl as a binary artifact / Linux, arm64 (push) Waiting to run
Package the js repl as a binary artifact / macOS, arm64 (push) Waiting to run
Package the js repl as a binary artifact / Linux, x86_64 (push) Waiting to run
Run test262 and test-wasm / run_and_update_results (push) Waiting to run
Lint Code / lint (push) Waiting to run
Label PRs with merge conflicts / auto-labeler (push) Waiting to run
Push notes / build (push) Waiting to run
Build Dev Container Image / build (push) Has been cancelled
Our floating point number parser was based on the fast_float library:
https://github.com/fastfloat/fast_float

However, our implementation only supports 8-bit characters. To support
UTF-16, we will need to be able to convert char16_t-based strings to
numbers as well. This works out-of-the-box with fast_float.

We can also use fast_float for integer parsing.
2025-07-03 09:51:56 -04:00
Timothy Flynn
9fc3e72db2 AK+Everywhere: Allow lonely UTF-16 surrogates by default
By definition, the web allows lonely surrogates by default. Let's have
our string APIs reflect this, so we don't have to pass an allow option
all over the place.
2025-07-03 09:51:56 -04:00
Timothy Flynn
86b1c78c1a AK+Everywhere: Prepare Utf16View for integration with a UTF-16 string
To prepare for an upcoming Utf16String, this migrates Utf16View to store
its data as a char16_t. Most function definitions are moved inline and
made constexpr.

This also adds a UDL to construct a Utf16View from a string literal:

    auto string = u"hello"sv;

This let's us remove the NTTP Utf16View constructor, as we have found
that such constructors bloat binary size quite a bit.
2025-07-03 09:51:56 -04:00
Andreas Kling
a0864dbb26 LibJS: Make mapped arguments objects way less allocation-happy
Some checks are pending
CI / Lagom (arm64, Sanitizer_CI, false, macos-15, macOS, Clang) (push) Waiting to run
CI / Lagom (x86_64, Fuzzers_CI, false, ubuntu-24.04, Linux, Clang) (push) Waiting to run
CI / Lagom (x86_64, Sanitizer_CI, false, ubuntu-24.04, Linux, GNU) (push) Waiting to run
CI / Lagom (x86_64, Sanitizer_CI, true, ubuntu-24.04, Linux, Clang) (push) Waiting to run
Package the js repl as a binary artifact / build-and-package (arm64, macos-15, macOS, macOS-universal2) (push) Waiting to run
Package the js repl as a binary artifact / build-and-package (x86_64, ubuntu-24.04, Linux, Linux-x86_64) (push) Waiting to run
Run test262 and test-wasm / run_and_update_results (push) Waiting to run
Lint Code / lint (push) Waiting to run
Label PRs with merge conflicts / auto-labeler (push) Waiting to run
Push notes / build (push) Waiting to run
By following the spec to the letter, our mapped arguments objects ended
up with many extra GC allocations:

- 1 extra Object for the internal [[ParameterMap]].
- 2 extra NativeFunctions for each mapped parameter accessor.
- 1 extra Accessor to hold the aforementioned NativeFunctions.

This patch removes all those allocations and lets ArgumentsObject model
the desired behavior in custom C++ instead of using script primitives.

1.06x speedup on Speedometer's TodoMVC-jQuery.
2025-05-11 14:00:40 +02:00
Timothy Flynn
3867a192a1 LibJS: Update spec steps / links for the import-assertions proposal
This proposal has reached stage 4 and been merged into the main ECMA-262
spec. See:

4e3450e
2025-04-29 07:33:08 -04:00
Andreas Kling
a05be67e4a LibJS: Let invokers (callers) of [[Call]] allocate ExecutionContext
Some checks are pending
CI / Lagom (arm64, Sanitizer_CI, false, macos-15, macOS, Clang) (push) Waiting to run
CI / Lagom (x86_64, Fuzzers_CI, false, ubuntu-24.04, Linux, Clang) (push) Waiting to run
CI / Lagom (x86_64, Sanitizer_CI, false, ubuntu-24.04, Linux, GNU) (push) Waiting to run
CI / Lagom (x86_64, Sanitizer_CI, true, ubuntu-24.04, Linux, Clang) (push) Waiting to run
Package the js repl as a binary artifact / build-and-package (arm64, macos-15, macOS, macOS-universal2) (push) Waiting to run
Package the js repl as a binary artifact / build-and-package (x86_64, ubuntu-24.04, Linux, Linux-x86_64) (push) Waiting to run
Run test262 and test-wasm / run_and_update_results (push) Waiting to run
Lint Code / lint (push) Waiting to run
Label PRs with merge conflicts / auto-labeler (push) Waiting to run
Push notes / build (push) Waiting to run
Instead of letting every [[Call]] implementation allocate an
ExecutionContext, we now make that a responsibility of the caller.

The main point of this exercise is to allow the Call instruction
to write function arguments directly into the callee ExecutionContext
instead of copying them later.

This makes function calls significantly faster:
- 10-20% faster on micro-benchmarks (depending on argument count)
- 4% speedup on Kraken
- 2% speedup on Octane
- 5% speedup on JetStream
2025-04-28 01:23:56 +02:00
Andrew Kaster
9bae24cc4a LibJS: Add and use ValidateNonRevokedProxy AO
This refactor is from two editorial changes to the spec from a while
back.

44d1cae2b2
21ffeee869
2025-04-24 10:37:39 +02:00
Aliaksandr Kalenik
a329868c1b LibJS: Allocate ExecutionContext memory using alloca() when possible
This should be faster than heap allocation. However, heap allocation is
still necessary in some cases, such as with generators and async
functions.
2025-04-24 10:30:52 +02:00
Aliaksandr Kalenik
c6cd03d7ca LibJS+LibWeb: Join arguments into vector of registers+constants+locals
This is better because:
- Better data locality
- Allocate vector for registers+constants+locals+arguments in one go
  instead of allocating two vectors separately
2025-04-24 10:30:52 +02:00
Aliaksandr Kalenik
80a8040794 LibJS+LibWeb: Calculate count of regs+consts+locals before EC allocation
This is a preparation step before joining arguments vector into vector
of registers+constants+locals.
2025-04-24 10:30:52 +02:00
Andreas Kling
669b1131ad LibJS: Streamline CreateMappedArgumentsObject [[ParameterMap]] creation
Instead of using the more generic define_native_accessor() here,
we poke directly at indexed property storage for the parameter map.

We can also construct the NativeFunction objects directly, without
giving them names like "get 0" etc, since these are not observable
by userspace.

Finally, by using default property attributes (not observable anyway),
we get simple indexed storage instead of generic (hash map) storage.
2025-04-20 18:43:11 +02:00
Andreas Kling
e0b32b1863 LibJS: Use premade shape when creating mapped arguments objects
Knocks out a 0.4% profile item on Speedometer 3.
2025-04-19 01:14:02 +02:00
Andreas Kling
e8c351505e LibJS: Use premade shape when creating unmapped arguments objects
Takes Speedometer2.1/EmberJS-Debug-TodoMVC from ~4500ms to ~4000ms
on my M3 MacBook Pro.
2025-04-15 13:08:27 +02:00
Andreas Kling
2a9b6f1d97 LibJS: Move computation out of the ECMAScriptFunctionObject constructor
We were doing way too much computation every time an ESFO was
instantiated. This was particularly sad, since the results of these
computations were identical every time!

This patch adds a new SharedFunctionInstanceData object that gets
shared between all instances of an ESFO instantiated from some kind of
AST FunctionNode.

~5% speedup on Speedometer 2.1 :^)
2025-04-08 18:52:35 +02:00
Andreas Kling
4209b18b88 LibJS: Add ECMAScriptFunctionObject::create_from_function_node() helper
This gives us a shared entry point for every situation where we
instantiate a function based on a FunctionNode from the AST.
2025-04-08 18:52:35 +02:00
Andreas Kling
3cf50539ec LibJS: Make Value() default-construct the undefined value
The special empty value (that we use for array holes, Optional<Value>
when empty and a few other other placeholder/sentinel tasks) still
exists, but you now create one via JS::js_special_empty_value() and
check for it with Value::is_special_empty_value().

The main idea here is to make it very unlikely to accidentally create an
unexpected special empty value.
2025-04-05 11:20:26 +02:00
Andreas Kling
de424d6879 LibJS: Make Completion.[[Value]] non-optional
Instead, just use js_undefined() whenever the [[Value]] field is unused.
This avoids a whole bunch of presence checks.
2025-04-05 11:20:26 +02:00
Andreas Kling
c71772126f LibJS: Remove ByteString internals from PrimitiveString
PrimitiveString is now internally either UTF-8, UTF-16, or both.
We no longer convert them to/from ByteString anywhere, nor does VM have
a ByteString cache.
2025-03-28 12:31:40 -04:00
Andreas Kling
7477002e46 LibJS: Keep parsed function parameters in a shared data structure
Instead of making a copy of the Vector<FunctionParameter> from the AST
every time we instantiate an ECMAScriptFunctionObject, we now keep the
parameters in a ref-counted FunctionParameters object.

This reduces memory usage, and also allows us to cache the bytecode
executables for default parameter expressions without recompiling them
for every instantiation. :^)
2025-03-27 15:00:43 +00:00
Andreas Kling
46a5710238 LibJS: Use FlyString in PropertyKey instead of DeprecatedFlyString
This required dealing with *substantial* fallout.
2025-03-24 22:27:17 +00:00
Andreas Kling
53da8893ac LibJS: Replace PropertyKey(char[]) with PropertyKey(FlyString)
...and deal with the fallout.
2025-03-24 22:27:17 +00:00
Andreas Kling
d7908dbff5 LibJS: Change PropertyKey(ByteString) to PropertyKey(String)
...and deal with the fallout.
2025-03-24 22:27:17 +00:00
Timothy Flynn
a8d6e5c3db LibJS: Migrate Temporal updates to ECMA-262 AOs to the main AO file
These are going to be included in the ECMA-262 AOs once Temporal reaches
stage 4. There's no need to keep them in the Temporal namespace. Some
upcoming Temporal editorial changes will get awkward without this patch.
2025-03-01 14:49:20 +01:00
Timothy Flynn
85b424464a AK+Everywhere: Rename verify_cast to as
Follow-up to fc20e61e72.
2025-01-21 11:34:06 -05:00
Timothy Flynn
b64a355a30 LibJS: Remove support for the "assert" keyword for import attributes
This was removed from the spec some time ago. See:
14286bb
2025-01-21 14:58:32 +01:00
Timothy Flynn
5ea0aa5f08 LibJS: Bring the explicit resource management implementation up to date
While we don't yet have a working `using` implementation with our byte
code, we can still keep our DisposableStack implementation up to date.
The changes brought in here are all editorial, and set us up to start
an AsyncDisposableStack implementation.
2025-01-17 20:46:32 +01:00
Shannon Booth
d48a0aaa55 LibJS: Remove unneeded FIXMEs for suspending an execution context
From what I understand, the suspension steps are not required now,
or in the future for our implementation, or any other. The intent
is already implemented in the spec pushing on another execution
context to the stack and leaving the running execution context as-is.

The resume steps are a slightly different story as there is some subtle
behavior which the spec is trying to convey where some custom logic may
need to be done when one execution context changes from one to another.
It may be worth implementing those steps at a later point in time so
that this behavior is a bit easier to follow in those cases.

To make the situation more confusing - from what I can gather from the
spec, not all cases that the spec mentions resume actually means
anything normative. Resume is only _actually_ needed in a limited set
of locations.

For now, let's just remove the unneeded FIXMEs that indicate that there
is something to be done for the suspension steps, as there is not, and
leave the resume steps as is.
2025-01-02 11:30:04 +01:00
Andreas Kling
3bfb0534be LibGC: Rename MarkedVector => RootVector
Let's try to make it a bit more clear that this is a Vector of GC roots.
2024-12-26 19:10:44 +01:00
Shannon Booth
f87041bf3a LibGC+Everywhere: Factor out a LibGC from LibJS
Resulting in a massive rename across almost everywhere! Alongside the
namespace change, we now have the following names:

 * JS::NonnullGCPtr -> GC::Ref
 * JS::GCPtr -> GC::Ptr
 * JS::HeapFunction -> GC::Function
 * JS::CellImpl -> GC::Cell
 * JS::Handle -> GC::Root
2024-11-15 14:49:20 +01:00
Shannon Booth
1e54003cb1 LibJS+LibWeb: Rename Heap::allocate_without_realm to Heap::allocate
Now that the heap has no knowledge about a JavaScript realm and is
purely for managing the memory of the heap, it does not make sense
to name this function to say that it is a non-realm variant.
2024-11-13 16:51:44 -05:00
Shannon Booth
9b79a686eb LibJS+LibWeb: Use realm.create<T> instead of heap.allocate<T>
The main motivation behind this is to remove JS specifics of the Realm
from the implementation of the Heap.

As a side effect of this change, this is a bit nicer to read than the
previous approach, and in my opinion, also makes it a little more clear
that this method is specific to a JavaScript Realm.
2024-11-13 16:51:44 -05:00
Shannon Booth
e02ca0480f LibJS: Allow unpaired surrogates in String.prototype.replace
This was resulting in a crash for the WPT test case:

https://wpt.live/xhr/send-data-string-invalid-unicode.any.html
2024-11-10 09:14:03 -05:00
Timothy Flynn
93712b24bf Everywhere: Hoist the Libraries folder to the top-level 2024-11-10 12:50:45 +01:00
Renamed from Userland/Libraries/LibJS/Runtime/AbstractOperations.cpp (Browse further)