Commit graph

507 commits

Author SHA1 Message Date
Timothy Flynn
1bcc29d0d1 LibJS+LibLocale: Replace Unicode keyword lookups with ICU
Note: All of the changes to the test files in this patch are now aligned
with both Chrome and Firefox.
2024-06-16 06:57:08 +02:00
Timothy Flynn
a1464342e1 LibJS+LibLocale: Remove unused parameter from keyword canonicalization 2024-06-16 06:57:08 +02:00
Timothy Flynn
5e2ee4447e LibJS+LibLocale: Replace plural rules selection with ICU
This uses ICU for all of the Intl.PluralRules prototypes, which lets us
remove all data from our plural rules generator.

Plural rules depend directly on internal data from the number formatter,
so rather than creating a separate Locale::PluralRules class (which will
make accessing that data awkward), this adds plural rules APIs to the
existing Locale::NumberFormat.
2024-06-15 06:57:16 +02:00
Timothy Flynn
7f9ccd39f5 LibJS+LibLocale: Replace relative time formatting with ICU
This uses ICU for all of the Intl.RelativeTimeFormat prototypes, which
lets us remove all data from our relative-time format generator.
2024-06-15 06:57:16 +02:00
Timothy Flynn
d634039c10 LibJS: Implement the latest Intl.DurationFormat proposal
The proposal has undergone quite a few normative changes since we last
synced with it. There was a time when it could not be implemented as it
was written, which is no longer the case. The resulting proposal has had
so many changes compared to our implementation, that it wouldn't make
sense to implement them commit-by-commit as we normally do. So instead,
this just implements the HEAD revision of the spec in one pass.
2024-06-14 07:59:42 +02:00
Timothy Flynn
4b3e26c583 LibJS+LibLocale: Replace calendar weekday information with ICU 2024-06-13 07:42:09 +02:00
Timothy Flynn
9cb1857dc6 LibJS+LibLocale: Replace preferred hour cycle lookups with ICU 2024-06-13 07:42:09 +02:00
Timothy Flynn
273694d8de LibJS+LibLocale: Replace date-time formatting with ICU
This uses ICU for the Intl.DateTimeFormat `format` `formatToParts`,
`formatRange`, and `formatRangeToParts`.

This lets us remove most data from our date-time format generator. All
that remains are time zone data and locale week info, which are relied
upon still for other interfaces. So they will be removed in a future
patch.

Note: All of the changes to the test files in this patch are now aligned
with other browsers. This includes:

* Some very incorrect formatting of Japanese symbols. (Looking at the
  old results now, it's very obvious they were wrong.)
* Old FIXMEs regarding range formatting not including the start/end date
  when only time fields were requested, but the dates differ.
* Day period inconsistencies.
2024-06-13 07:42:09 +02:00
Timothy Flynn
3b68bb6e73 LibJS: Store Intl mathematical values as strings when appropriate
The IntlMV is meant to be arbitrarily precise. If the user provides a
string value to be formatted, we lose precision by converting extremely
large values to a double. We were never able to address this, as support
for arbitrary precision was a big FIXME. But ICU can handle it by just
passing the raw string on through.
2024-06-10 13:51:51 +02:00
Timothy Flynn
f6bee0f5a8 LibJS+LibLocale: Replace number range formatting with ICU
This uses ICU for the Intl.NumberFormat `formatRange` and
`formatRangeToParts` prototypes.

Note: All of the changes to the test files in this patch are now aligned
with both Chrome and Safari.
2024-06-10 13:51:51 +02:00
Timothy Flynn
67f3de2320 LibJS+LibLocale: Begin replacing number formatting with ICU
This uses ICU for the Intl.NumberFormat `format` and `formatToParts`
prototypes. It does not yet port the range formatter prototypes.

Most of the new code in LibLocale/NumberFormat is simply mapping from
ECMA-402 types to ICU types. Beyond that, the only algorithmic change is
that we have to mutate the output from ICU for `formatToParts` to match
what is expected by ECMA-402. This is explained in NumberFormat.cpp in
`flatten_partitions`.

This lets us remove most data from our number format generator. All that
remains are numbering system digits and symbols, which are relied upon
still for other interfaces (e.g. Intl.DateTimeFormat). So they will be
removed in a future patch.

Note: All of the changes to the test files in this patch are now aligned
with both Chrome and Safari.
2024-06-10 13:51:51 +02:00
Timothy Flynn
5f7251fd91 LibJS+LibLocale: Replace list formatting with ICU
This also largely eliminates the need for some ECMA-402 AOs, as is it
all handled internally by ICU (which the spec is basically based on).
2024-06-09 10:47:28 +02:00
Timothy Flynn
d17d131224 LibJS+LibLocale: Replace locale character ordering with ICU 2024-06-09 10:47:28 +02:00
Timothy Flynn
e487f91388 LibJS+LibLocale: Replace locale maximization and minimization with ICU 2024-06-09 10:47:28 +02:00
Timothy Flynn
9724a25daf LibJS+LibLocale: Replace canonical locales and display names with ICU
Note: We keep locale parsing and syntactic validation as-is. ECMA-402
places additional restrictions on locales above what is required by the
Unicode spec. ICU doesn't provide methods that let us easily check those
restrictions, whereas LibLocale does. Other browsers also implement
their own validators here.

This introduces a locale cache to re-use parsed locale data and various
related structures (not doing so has a non-negligible performance impact
on Intl tests).

The existing APIs for canonicalization and display names are pretty
intertwined, so they must both be adapted at once here. The results of
canonicalization are slightly different on some edge cases. But the
changed results are actually now aligned with Chrome and Safari.
2024-06-09 10:47:28 +02:00
Timothy Flynn
ec492a1a08 Everywhere: Run clang-format
The following command was used to clang-format these files:

    clang-format-18 -i $(find . \
        -not \( -path "./\.*" -prune \) \
        -not \( -path "./Base/*" -prune \) \
        -not \( -path "./Build/*" -prune \) \
        -not \( -path "./Toolchain/*" -prune \) \
        -not \( -path "./Ports/*" -prune \) \
        -type f -name "*.cpp" -o -name "*.mm" -o -name "*.h")

There are a couple of weird cases where clang-format now thinks that a
pointer access in an initializer list, e.g. `m_member(ptr->foo)`, is a
lambda return statement, and it puts spaces around the `->`.
2024-04-24 16:50:01 -04:00
Timothy Flynn
0d3072bdac LibJS: Use IteratorStepValue in ECMA-402
This is an editorial change in the ECMA-402 spec. See:
e295500
2024-02-03 14:07:26 -05:00
Shannon Booth
986abe7047 LibJS: Rename IntlNumberIsNaNOrInfinity to NumberIsNaNOrInfinity
While only currently used in Intl in LibJS, this is a pretty generic
error and is useful elsewhere. Rename it to something more generic.
2024-01-02 10:01:26 +01:00
Andreas Kling
f4fa37afd2 LibJS+LibWeb: Add missing JS_DEFINE_ALLOCATOR() for a bunch of classes 2023-12-23 23:02:10 +01:00
Shannon Booth
e2e7c4d574 Everywhere: Use to_number<T> instead of to_{int,uint,float,double}
In a bunch of cases, this actually ends up simplifying the code as
to_number will handle something such as:

```
Optional<I> opt;
if constexpr (IsSigned<I>)
    opt = view.to_int<I>();
else
    opt = view.to_uint<I>();
```

For us.

The main goal here however is to have a single generic number conversion
API between all of the String classes.
2023-12-23 20:41:07 +01:00
Ali Mohammad Pur
5e1499d104 Everywhere: Rename {Deprecated => Byte}String
This commit un-deprecates DeprecatedString, and repurposes it as a byte
string.
As the null state has already been removed, there are no other
particularly hairy blockers in repurposing this type as a byte string
(what it _really_ is).

This commit is auto-generated:
  $ xs=$(ack -l \bDeprecatedString\b\|deprecated_string AK Userland \
    Meta Ports Ladybird Tests Kernel)
  $ perl -pie 's/\bDeprecatedString\b/ByteString/g;
    s/deprecated_string/byte_string/g' $xs
  $ clang-format --style=file -i \
    $(git diff --name-only | grep \.cpp\|\.h)
  $ gn format $(git ls-files '*.gn' '*.gni')
2023-12-17 18:25:10 +03:30
Andreas Kling
3c74dc9f4d LibJS: Segregate GC-allocated objects by type
This patch adds two macros to declare per-type allocators:

- JS_DECLARE_ALLOCATOR(TypeName)
- JS_DEFINE_ALLOCATOR(TypeName)

When used, they add a type-specific CellAllocator that the Heap will
delegate allocation requests to.

The result of this is that GC objects of the same type always end up
within the same HeapBlock, drastically reducing the ability to perform
type confusion attacks.

It also improves HeapBlock utilization, since each block now has cells
sized exactly to the type used within that block. (Previously we only
had a handful of block sizes available, and most GC allocations ended
up with a large amount of slack in their tails.)

There is a small performance hit from this, but I'm sure we can make
up for it elsewhere.

Note that the old size-based allocators still exist, and we fall back
to them for any type that doesn't have its own CellAllocator.
2023-11-19 12:10:31 +01:00
Timothy Flynn
1d76738dde LibJS: Change Intl.Locale info APIs from property getters to methods
This is a normative change in the Intl Locale Info spec. See:
e550152
2023-11-13 20:10:58 +01:00
Timothy Flynn
a357874c77 LibJS: Implement Intl.Locale.prototype.firstDayOfWeek
This is a normative change in the Intl Locale Info spec. See:
f03a814
2023-11-13 20:10:58 +01:00
Daniel Bertalan
6f972c190b Everywhere: Work around Clang trunk bug with templated lambda + Variant
Since 2023-09-08, Clang trunk has had a bug which causes a segfault when
evaluating certain `requires` expressions inside templated lambdas.
There isn't an imminent fix on the horizon, so let's work around the
issue by specifying the type of the offending lambda arguments
explicitly.

See https://github.com/llvm/llvm-project/issues/67260
2023-11-05 13:41:13 -07:00
Andreas Kling
65717e3b75 LibJS: Inline fast case for Value::to_{boolean,number,numeric,primitive}
These functions all have a very common case that can be dealt with a
very simple inline check, often avoiding the need to call an out-of-line
function. This patch moves the common case to inline functions in a new
ValueInlines.h header (necessary due to header dependency issues..)

8% speed-up on the entire Kraken benchmark :^)
2023-10-07 07:13:52 +02:00
Timothy Flynn
03be26317f LibJS: Alphabetize handling some Intl.NumberFormat/PluralRules options
This is a normative change in the ECMA-402 spec. See:
5a43090
2023-10-05 17:01:02 +02:00
Timothy Flynn
05e080c4ba LibJS: Correctly resolve locale hour cycles in Intl.DateTimeFormat
This is a normative change in the ECMA-402 spec. See:
2f002b2
2023-10-05 17:01:02 +02:00
Timothy Flynn
39be5cb73a LibJS: Allow formatting UTC-offset time zones with Intl.DateTimeFormat
These are normative changes in the ECMA-402 spec. See:
896ffcc
af4ec46
e25c455

(This combines the above commits into one patch as they each do not work
on their own).
2023-10-05 17:01:02 +02:00
Timothy Flynn
f31540e419 LibJS: Implement time zone identifier AOs centrally within Date
This is an editorial change in the ECMA-262 spec. See:
73926a5

The idea here is to reduce duplication of these AOs between ECMA-262,
ECMA-402, and Temporal. This patch contains only the ECMA-262 changes.
2023-10-05 17:01:02 +02:00
Timothy Flynn
0bc401a1d6 LibTimeZone+Userland: Include Link entries when returning all time zones
We currently only return primary time zones, i.e. time zones that are
not a Link. LibJS will require knowledge of Link entries, and whether
each entry is or is not a Link.
2023-10-05 17:01:02 +02:00
Timothy Flynn
b6835d2c40 LibJS: Stop propagating small OOM errors from Intl.RelativeTimeFormat 2023-09-05 08:08:09 +02:00
Timothy Flynn
b3694653a7 LibJS: Stop propagating small OOM errors from Intl.NumberFormat
Note this also does the same for Intl.PluralRules. The only OOM errors
propagated from Intl.PluralRules were from Intl.NumberFormat.
2023-09-05 08:08:09 +02:00
Timothy Flynn
30a812b77b LibJS: Stop propagating small OOM errors from Intl.MathematicalValue 2023-09-05 08:08:09 +02:00
Timothy Flynn
746ce6f9a1 LibJS: Stop propagating small OOM errors from Intl.Locale 2023-09-05 08:08:09 +02:00
Timothy Flynn
9e5055c298 LibJS: Stop propagating small OOM errors from Intl.ListFormat 2023-09-05 08:08:09 +02:00
Timothy Flynn
76b5974f08 LibJS: Stop propagating small OOM errors from the Intl namespace object 2023-09-05 08:08:09 +02:00
Timothy Flynn
20aaa2c236 LibJS: Stop propagating small OOM errors from Intl.DurationFormat 2023-09-05 08:08:09 +02:00
Timothy Flynn
b78cbf88db LibJS: Stop propagating small OOM errors from Intl.DateTimeFormat 2023-09-05 08:08:09 +02:00
Timothy Flynn
1708c1fdfe LibJS: Stop propagating small OOM errors from Intl.Collator 2023-09-05 08:08:09 +02:00
Timothy Flynn
b6ff25bd26 LibJS: Stop propagating small OOM errors from Intl abstract operations 2023-09-05 08:08:09 +02:00
Timothy Flynn
ca0d926036 LibJS: Use decimal compact patterns for currency style sub-patterns
When formatting a currency style pattern with compact notation, we were
(trying to) doubly insert the currency symbol into the formatted string.
We would first look up the currency pattern in GetNumberFormatPattern
(for the en locale, this is "¤#,##0.00", which our generator transforms
to "{currency}{number}").

When we hit the "{number}" field, NumberFormat will do a second lookup
for the compact pattern to use for the number being formatted. By using
the currency compact patterns, we receive a second pattern that also has
the currency symbol (for the en locale, if formatting the number 1000,
this is "¤0K", which our generator transforms to
"{currency}{number}{compactIdentifier:0}". This second lookup is not
supposed to have currency symbols (or any other symbols), thus we hit a
VERIFY_NOT_REACHED().

Instead, we are meant to use the decimal compact pattern, and allow the
currency symbol to be handled by only the outer currency pattern.
2023-09-04 18:22:28 +02:00
Timothy Flynn
eb8f7b303c LibLocale+LibJS: Make relative time format APIs infallible
These APIs only perform small allocations, and are only used by LibJS.
Callers which could only have failed from these APIs are also made to
be infallible here.
2023-08-23 05:29:21 +02:00
Timothy Flynn
7536648498 LibLocale+LibJS+ClockSettings: Make date time format APIs infallible
These APIs only perform small allocations, and are only used by LibJS
and the time zone settings widget. Callers which could only have failed
from these APIs are also made to be infallible here.
2023-08-23 05:29:21 +02:00
Timothy Flynn
0914e86691 LibLocale+LibJS: Make number format APIs infallible
These APIs only perform small allocations, and are only used by LibJS.
Callers which could only have failed from these APIs are also made to
be infallible here.
2023-08-23 05:29:21 +02:00
Timothy Flynn
cd526813e6 LibLocale+LibJS: Make locale data APIs infallible
These APIs only perform small allocations, and are only used by LibJS.
Callers which could only have failed from these APIs are also made to
be infallible here.
2023-08-23 05:29:21 +02:00
Timothy Flynn
341626e2ea LibJS: Reorder NumberFormat's rounding priority resolved option
This is a normative change in the ECMA-402 spec. See:
0fbf16c
2023-08-14 07:48:54 -04:00
Timothy Flynn
b0c8543b28 LibJS: Compute NumberFormat's rounding priority during construction
This is an editorial change in the ECMA-402 spec. See:
c28118e
2023-08-14 07:48:54 -04:00
Andreas Kling
72c9f56c66 LibJS: Make Heap::allocate<T>() infallible
Stop worrying about tiny OOMs. Work towards #20449.

While going through these, I also changed the function signature in many
places where returning ThrowCompletionOr<T> is no longer necessary.
2023-08-13 15:38:42 +02:00
Andreas Kling
b8f78c0adc LibJS: Make JS::number_to_string() infallible
Work towards #20449.
2023-08-09 17:09:16 +02:00