This allows us to remove the BoundFunction::m_name field, which we
were initializing with a formatted FlyString on every function binding,
despite never using it for anything.
We were doing way too much computation every time an ESFO was
instantiated. This was particularly sad, since the results of these
computations were identical every time!
This patch adds a new SharedFunctionInstanceData object that gets
shared between all instances of an ESFO instantiated from some kind of
AST FunctionNode.
~5% speedup on Speedometer 2.1 :^)
We cached the length identifier for GetLength, but not
GetLengthWithThis. This caused an `has_value()` verification failure
when accessing super.length. Found by Fuzzilli.
This is a slightly mystified attempt to recover the performance
regression seen on our JS benchmark runner after 3c2a2bb39f
and c7bba505ea.
With this change, c7bba505ea is effectively reverted from the
perspective of x86_64.
It seems both aarch64 and x86_64 are extremely sensitive to the use of
bitfields here. Unfortunately, aarch64 gains a huge speedup from them
while x86_64 sees a very noticeable slowdown.
Since we're talking about 5%+ swings in both directions here, let's go
for the best of both worlds and use ifdefs in the Operand memory layout.
Before this change, we were inlining this function after every
handler for instructions that could throw.
Forcing it out-of-line shrinks the main bytecode interpreter by 15%
and yields a decent 2.5% speedup on JetStream/gcc-loops.cpp.js
This packs the bytecode much better and gives us a decent performance
boost on throughput-focused benchmarks.
Measured on my M3 MacBook Pro:
- 4.7% speedup on Kraken
- 2.3% speedup on Octane
- 2.7% speedup on JetStream1
We can use the index's invalid state to signal an empty optional.
This makes Optional<StringTableIndex> 4 bytes instead of 8,
shrinking every bytecode instruction that uses these.
The special empty value (that we use for array holes, Optional<Value>
when empty and a few other other placeholder/sentinel tasks) still
exists, but you now create one via JS::js_special_empty_value() and
check for it with Value::is_special_empty_value().
The main idea here is to make it very unlikely to accidentally create an
unexpected special empty value.
Basically convert o["foo"]=x into o.foo=x when emitting bytecode.
These are effectively the same thing, and the latter format opts
into using an inline cache for the property lookups.
Basically convert o["foo"] into o.foo when emitting bytecode. These are
effectively the same thing, and the latter format opts into using an
inline cache for the property lookups.
Instead of returning internal generator results as ordinary JS::Objects
with properties, we now use GeneratorResult and CompletionCell which
both inherit from Cell directly and allow efficient access to state.
1.59x speedup on JetStream3/lazy-collections.js :^)
We can use the index's invalid state to signal an empty optional.
This makes Optional<IdentifierTableIndex> 4 bytes instead of 8,
shrinking every bytecode instruction that uses these.
Instead of making a copy of the Vector<FunctionParameter> from the AST
every time we instantiate an ECMAScriptFunctionObject, we now keep the
parameters in a ref-counted FunctionParameters object.
This reduces memory usage, and also allows us to cache the bytecode
executables for default parameter expressions without recompiling them
for every instantiation. :^)
This works because at the end of the finally chunk, a
ContinuePendingUnwind is generated which copies the saved return value
register into the return value register. In cases where
ContinuePendingUnwind is not generated such as when there is a break
statement in the finally block, the fonction will return undefined which
is consistent with V8 and SpiderMonkey.
This avoids emitting TDZ checks for multiple bindings declared and
referenced within one variable declaration, i.e:
var foo = 0, bar = foo;
In the above case, we'd emit an unnecessary TDZ check for the second
reference to `foo`.
Before this change, we would enumerate all the keys with
[[OwnPropertyKeys]], and then do [[GetOwnPropertyDescriptor]] twice for
each key as we went through them.
We now only do one [[GetOwnPropertyDescriptor]] per key, which
drastically reduces the number of proxy traps when those are involved.
The new trap sequence matches what you get with V8, so I don't think
anyone will be unpleasantly surprised here.
If all the parameter default values end up in locals, the lexical
environment we create to hold them would never be used for anything,
and so we can elide it and avoid the GC work.