This uses ICU for all of the Intl.PluralRules prototypes, which lets us
remove all data from our plural rules generator.
Plural rules depend directly on internal data from the number formatter,
so rather than creating a separate Locale::PluralRules class (which will
make accessing that data awkward), this adds plural rules APIs to the
existing Locale::NumberFormat.
The proposal has undergone quite a few normative changes since we last
synced with it. There was a time when it could not be implemented as it
was written, which is no longer the case. The resulting proposal has had
so many changes compared to our implementation, that it wouldn't make
sense to implement them commit-by-commit as we normally do. So instead,
this just implements the HEAD revision of the spec in one pass.
This uses ICU for the Intl.DateTimeFormat `format` `formatToParts`,
`formatRange`, and `formatRangeToParts`.
This lets us remove most data from our date-time format generator. All
that remains are time zone data and locale week info, which are relied
upon still for other interfaces. So they will be removed in a future
patch.
Note: All of the changes to the test files in this patch are now aligned
with other browsers. This includes:
* Some very incorrect formatting of Japanese symbols. (Looking at the
old results now, it's very obvious they were wrong.)
* Old FIXMEs regarding range formatting not including the start/end date
when only time fields were requested, but the dates differ.
* Day period inconsistencies.
The IntlMV is meant to be arbitrarily precise. If the user provides a
string value to be formatted, we lose precision by converting extremely
large values to a double. We were never able to address this, as support
for arbitrary precision was a big FIXME. But ICU can handle it by just
passing the raw string on through.
This uses ICU for the Intl.NumberFormat `formatRange` and
`formatRangeToParts` prototypes.
Note: All of the changes to the test files in this patch are now aligned
with both Chrome and Safari.
This uses ICU for the Intl.NumberFormat `format` and `formatToParts`
prototypes. It does not yet port the range formatter prototypes.
Most of the new code in LibLocale/NumberFormat is simply mapping from
ECMA-402 types to ICU types. Beyond that, the only algorithmic change is
that we have to mutate the output from ICU for `formatToParts` to match
what is expected by ECMA-402. This is explained in NumberFormat.cpp in
`flatten_partitions`.
This lets us remove most data from our number format generator. All that
remains are numbering system digits and symbols, which are relied upon
still for other interfaces (e.g. Intl.DateTimeFormat). So they will be
removed in a future patch.
Note: All of the changes to the test files in this patch are now aligned
with both Chrome and Safari.
Note: We keep locale parsing and syntactic validation as-is. ECMA-402
places additional restrictions on locales above what is required by the
Unicode spec. ICU doesn't provide methods that let us easily check those
restrictions, whereas LibLocale does. Other browsers also implement
their own validators here.
This introduces a locale cache to re-use parsed locale data and various
related structures (not doing so has a non-negligible performance impact
on Intl tests).
The existing APIs for canonicalization and display names are pretty
intertwined, so they must both be adapted at once here. The results of
canonicalization are slightly different on some edge cases. But the
changed results are actually now aligned with Chrome and Safari.
The following command was used to clang-format these files:
clang-format-18 -i $(find . \
-not \( -path "./\.*" -prune \) \
-not \( -path "./Base/*" -prune \) \
-not \( -path "./Build/*" -prune \) \
-not \( -path "./Toolchain/*" -prune \) \
-not \( -path "./Ports/*" -prune \) \
-type f -name "*.cpp" -o -name "*.mm" -o -name "*.h")
There are a couple of weird cases where clang-format now thinks that a
pointer access in an initializer list, e.g. `m_member(ptr->foo)`, is a
lambda return statement, and it puts spaces around the `->`.
In a bunch of cases, this actually ends up simplifying the code as
to_number will handle something such as:
```
Optional<I> opt;
if constexpr (IsSigned<I>)
opt = view.to_int<I>();
else
opt = view.to_uint<I>();
```
For us.
The main goal here however is to have a single generic number conversion
API between all of the String classes.
This commit un-deprecates DeprecatedString, and repurposes it as a byte
string.
As the null state has already been removed, there are no other
particularly hairy blockers in repurposing this type as a byte string
(what it _really_ is).
This commit is auto-generated:
$ xs=$(ack -l \bDeprecatedString\b\|deprecated_string AK Userland \
Meta Ports Ladybird Tests Kernel)
$ perl -pie 's/\bDeprecatedString\b/ByteString/g;
s/deprecated_string/byte_string/g' $xs
$ clang-format --style=file -i \
$(git diff --name-only | grep \.cpp\|\.h)
$ gn format $(git ls-files '*.gn' '*.gni')
This patch adds two macros to declare per-type allocators:
- JS_DECLARE_ALLOCATOR(TypeName)
- JS_DEFINE_ALLOCATOR(TypeName)
When used, they add a type-specific CellAllocator that the Heap will
delegate allocation requests to.
The result of this is that GC objects of the same type always end up
within the same HeapBlock, drastically reducing the ability to perform
type confusion attacks.
It also improves HeapBlock utilization, since each block now has cells
sized exactly to the type used within that block. (Previously we only
had a handful of block sizes available, and most GC allocations ended
up with a large amount of slack in their tails.)
There is a small performance hit from this, but I'm sure we can make
up for it elsewhere.
Note that the old size-based allocators still exist, and we fall back
to them for any type that doesn't have its own CellAllocator.
Since 2023-09-08, Clang trunk has had a bug which causes a segfault when
evaluating certain `requires` expressions inside templated lambdas.
There isn't an imminent fix on the horizon, so let's work around the
issue by specifying the type of the offending lambda arguments
explicitly.
See https://github.com/llvm/llvm-project/issues/67260
These functions all have a very common case that can be dealt with a
very simple inline check, often avoiding the need to call an out-of-line
function. This patch moves the common case to inline functions in a new
ValueInlines.h header (necessary due to header dependency issues..)
8% speed-up on the entire Kraken benchmark :^)
These are normative changes in the ECMA-402 spec. See:
896ffccaf4ec46e25c455
(This combines the above commits into one patch as they each do not work
on their own).
This is an editorial change in the ECMA-262 spec. See:
73926a5
The idea here is to reduce duplication of these AOs between ECMA-262,
ECMA-402, and Temporal. This patch contains only the ECMA-262 changes.
We currently only return primary time zones, i.e. time zones that are
not a Link. LibJS will require knowledge of Link entries, and whether
each entry is or is not a Link.
When formatting a currency style pattern with compact notation, we were
(trying to) doubly insert the currency symbol into the formatted string.
We would first look up the currency pattern in GetNumberFormatPattern
(for the en locale, this is "¤#,##0.00", which our generator transforms
to "{currency}{number}").
When we hit the "{number}" field, NumberFormat will do a second lookup
for the compact pattern to use for the number being formatted. By using
the currency compact patterns, we receive a second pattern that also has
the currency symbol (for the en locale, if formatting the number 1000,
this is "¤0K", which our generator transforms to
"{currency}{number}{compactIdentifier:0}". This second lookup is not
supposed to have currency symbols (or any other symbols), thus we hit a
VERIFY_NOT_REACHED().
Instead, we are meant to use the decimal compact pattern, and allow the
currency symbol to be handled by only the outer currency pattern.
These APIs only perform small allocations, and are only used by LibJS.
Callers which could only have failed from these APIs are also made to
be infallible here.
These APIs only perform small allocations, and are only used by LibJS
and the time zone settings widget. Callers which could only have failed
from these APIs are also made to be infallible here.
These APIs only perform small allocations, and are only used by LibJS.
Callers which could only have failed from these APIs are also made to
be infallible here.
These APIs only perform small allocations, and are only used by LibJS.
Callers which could only have failed from these APIs are also made to
be infallible here.
Stop worrying about tiny OOMs. Work towards #20449.
While going through these, I also changed the function signature in many
places where returning ThrowCompletionOr<T> is no longer necessary.