LibLocale was split off from LibUnicode a couple years ago to reduce the
number of applications on SerenityOS that depend on CLDR data. Now that
we use ICU, both LibUnicode and LibLocale are actually linking in this
data. And since vcpkg gives us static libraries, both libraries are over
30MB in size.
This patch reverts the separation and merges LibLocale into LibUnicode
again. We now have just one library that includes the ICU data.
Further, this will let LibUnicode share the locale cache that previously
would only exist in LibLocale.
The IntlMV is meant to be arbitrarily precise. If the user provides a
string value to be formatted, we lose precision by converting extremely
large values to a double. We were never able to address this, as support
for arbitrary precision was a big FIXME. But ICU can handle it by just
passing the raw string on through.
This uses ICU for the Intl.NumberFormat `format` and `formatToParts`
prototypes. It does not yet port the range formatter prototypes.
Most of the new code in LibLocale/NumberFormat is simply mapping from
ECMA-402 types to ICU types. Beyond that, the only algorithmic change is
that we have to mutate the output from ICU for `formatToParts` to match
what is expected by ECMA-402. This is explained in NumberFormat.cpp in
`flatten_partitions`.
This lets us remove most data from our number format generator. All that
remains are numbering system digits and symbols, which are relied upon
still for other interfaces (e.g. Intl.DateTimeFormat). So they will be
removed in a future patch.
Note: All of the changes to the test files in this patch are now aligned
with both Chrome and Safari.
This will be needed by Value::to_string_without_side_effects, which can
be called in contexts without a VM (e.g. in AK::Format specializations).
So to_string_without_side_effects will need to be callable without a VM,
thus NumberToString must be as well.
Three standalone Cell creation functions remain in the JS namespace:
- js_bigint()
- js_string()
- js_symbol()
All of them are leftovers from early iterations when LibJS still took
inspiration from JSC, which itself has jsString(). Nowadays, we pretty
much exclusively use static create() functions to construct types
allocated on the JS heap, and there's no reason to not do the same for
these.
Also change the return type from BigInt* to NonnullGCPtr<BigInt> while
we're here.
This is patch 1/3, replacement of js_string() and js_symbol() follow.
This will make it easier to support both string types at the same time
while we convert code, and tracking down remaining uses.
One big exception is Value::to_string() in LibJS, where the name is
dictated by the ToString AO.
We have a new, improved string type coming up in AK (OOM aware, no null
state), and while it's going to use UTF-8, the name UTF8String is a
mouthful - so let's free up the String name by renaming the existing
class.
Making the old one have an annoying name will hopefully also help with
quick adoption :^)
Instead we just use a specific constructor. With this set of
constructors using curly braces for constructing is highly recommended.
As then it will not do too many implicit conversions which could lead to
unexpected loss of data or calling the much slower double constructor.
Also to ensure we don't feed (Un)SignedBigInteger infinities we throw
RangeError earlier for Durations.
Instead of passing a GlobalObject everywhere, we will simply pass a VM,
from which we can get everything we need: common names, the current
realm, symbols, arguments, the heap, and a few other things.
In some places we already don't actually need a global object and just
do it for consistency - no more `auto& vm = global_object.vm();`!
This will eventually automatically fix the "wrong realm" issue we have
in some places where we (incorrectly) use the global object from the
allocating object, e.g. in call() / construct() implementations. When
only ever a VM is passed around, this issue can't happen :^)
I've decided to split this change into a series of patches that should
keep each commit down do a somewhat manageable size.
The Intl mathematical value is much like ECMA-262's mathematical value
in that it is meant to represent an arbitrarily precise number. The Intl
MV further allows positive/negative infinity, negative zero, and NaN.
This implementation is *not* arbitrarily precise. Rather, it is a
replacement for the use of Value within Intl.NumberFormat. The exact
syntax of the Intl MV is still being worked on, but abstracting this
away into its own class will allow hooking in the finalized Intl MV
more easily, and makes implementing Intl.NumberFormat.formatRange
easier.
Note the methods added here are essentially the same as the static
helpers in Intl/NumberFormat.cpp.