Negate was incorrectly returning "== 0", a copy/paste from EqualsZero.
This patch corrects it to return "neg", matching the operator's actual
semantics and WebAssembly mnemonics (f32.neg, f64.neg).
This, along with moving the sources and destination out of the config
object, makes it so we don't have to double-deref to get to them on each
instruction, leading to a ~15% perf improvement on dispatch.
Namely, find an upper bound at validation time so we can allocate the
space when entering the frame.
Also drop labels at once instead of popping them off one at a time now
that we're using a Vector.
This still passes the values on the stack, but registers are now allowed
to cross a call boundary.
This is a very significant (>50%) improvement on the small call
microbenchmarks on my machine.
Largely combinations of i32.const and local.get.
This shaves off at most single-digit% number of instructions from
dispatch, which translates to at most ~10% reduced dispatch time.
Across most benchmarks, this gains around ~5% perf increase.
This commit adds a register allocator, with 8 available "register"
slots.
In testing with various random blobs, this moves anywhere from 30% to
74% of value accesses into predefined slots, and is about a ~20% perf
increase end-to-end.
To actually make this usable, a few structural changes were also made:
- we no longer do one instruction per interpret call
- trapping is an (unlikely) exit condition
- the label and frame stacks are replaced with linked lists with a huge
node cache size, as we only need to touch the last element and
push/pop is very frequent.
This is largely unused (only in wasm.cpp)
A future reimplementation can bring it back as a separate interpreter
class that embeds the current bytecode interpreter.
The average wasm function rarely goes over these bounds for the labels
(32 nested control structures), and 8 frames is just enough to clear
most initialization code/start section without allocating anything.
If the address is already aligned properly, just read a T from it;
otherwise copy it to a local aligned array. This was a bottleneck on
memory-heavy benchmarks.
Instead of returning whichever argument was NaN, return the canonical
NaN instead. The spec allows the old behavior:
"Following the recommendation that operators propagate NaN payloads
from their operands is permitted but not required."
But Chrome, Firefox and Safari do not propagate the operand payloads.
Fixes 448 WPT subtests in `wasm/core`.
Co-authored-by: Ali Mohammad Pur <ali.mpfard@gmail.com>
The spec tells us to reject the promise with a RuntimeError instead of a
LinkError whenever the module's start function fails during module
instantiation. Fixes 1 WPT subtest in `wasm/core`.
...instead of specially handling JS::Completion.
This makes it possible for LibWeb/LibJS to have full control over how
these things are made, stored, and visited (whenever).
Fixes an issue where we couldn't roundtrip a JS exception through Wasm.