This constructor has undergone a handful of editorial changes that we
fell behind on. But we weren't able to take the updates until now due to
a spec bug in those updates. See:
3f029b0
The result is that we can remove the inheritance of Intl::DateTimeFormat
from Unicode::DateTimeFormat; the former now contains the latter as an
internal slot.
This required updating some LibJS spec steps to their latest versions,
as the data expected by the old steps does not quite match the APIs that
are available with the ICU. The new spec steps are much more aligned.
LibLocale was split off from LibUnicode a couple years ago to reduce the
number of applications on SerenityOS that depend on CLDR data. Now that
we use ICU, both LibUnicode and LibLocale are actually linking in this
data. And since vcpkg gives us static libraries, both libraries are over
30MB in size.
This patch reverts the separation and merges LibLocale into LibUnicode
again. We now have just one library that includes the ICU data.
Further, this will let LibUnicode share the locale cache that previously
would only exist in LibLocale.
There have been a number of changes to the locale resolution AOs that
we've fallen behind on. Mostly editorial, but includes one normative
change to canonicalize Unicode extension keywords in the Intl.Locale
constructor.
We were previously treating undefined and null as the same (an empty
Optional). However, there are edge cases in ECMA-402 where we must treat
them differently. Namely, the hour cycle (hc) keyword. An undefined hc
value has no effect on the resolved locale, whereas a null hc value can
actively override any hc specified in the locale string. For example:
new Intl.DateTimeFormat("en-u-hc-h11", { hour12: false });
In that object, the hour12 option does not match the u-hc-h11 value. So
the spec dictates we remove the hc value by setting it to null.
This uses ICU for the Intl.DateTimeFormat `format` `formatToParts`,
`formatRange`, and `formatRangeToParts`.
This lets us remove most data from our date-time format generator. All
that remains are time zone data and locale week info, which are relied
upon still for other interfaces. So they will be removed in a future
patch.
Note: All of the changes to the test files in this patch are now aligned
with other browsers. This includes:
* Some very incorrect formatting of Japanese symbols. (Looking at the
old results now, it's very obvious they were wrong.)
* Old FIXMEs regarding range formatting not including the start/end date
when only time fields were requested, but the dates differ.
* Day period inconsistencies.
This patch adds two macros to declare per-type allocators:
- JS_DECLARE_ALLOCATOR(TypeName)
- JS_DEFINE_ALLOCATOR(TypeName)
When used, they add a type-specific CellAllocator that the Heap will
delegate allocation requests to.
The result of this is that GC objects of the same type always end up
within the same HeapBlock, drastically reducing the ability to perform
type confusion attacks.
It also improves HeapBlock utilization, since each block now has cells
sized exactly to the type used within that block. (Previously we only
had a handful of block sizes available, and most GC allocations ended
up with a large amount of slack in their tails.)
There is a small performance hit from this, but I'm sure we can make
up for it elsewhere.
Note that the old size-based allocators still exist, and we fall back
to them for any type that doesn't have its own CellAllocator.
Since 2023-09-08, Clang trunk has had a bug which causes a segfault when
evaluating certain `requires` expressions inside templated lambdas.
There isn't an imminent fix on the horizon, so let's work around the
issue by specifying the type of the offending lambda arguments
explicitly.
See https://github.com/llvm/llvm-project/issues/67260
These are normative changes in the ECMA-402 spec. See:
896ffccaf4ec46e25c455
(This combines the above commits into one patch as they each do not work
on their own).
This is an editorial change in the ECMA-262 spec. See:
73926a5
The idea here is to reduce duplication of these AOs between ECMA-262,
ECMA-402, and Temporal. This patch contains only the ECMA-262 changes.
These APIs only perform small allocations, and are only used by LibJS
and the time zone settings widget. Callers which could only have failed
from these APIs are also made to be infallible here.
This is a normative change in the ECMA-402 spec. See:
02bd03a
This is observable just due to reading the properties one less time. It
would have been possible for e.g. the property values to change between
those invocations.
This is an editorial change in the ECMA-402 spec. See:
b6ed64b6e3b70d
This moves Intl.DateTimeFormat creation to InitializeDateTimeFormat and
renames that AO to CreateDateTimeFormat.
Some of these are allocated upon initialization of the intrinsics, and
some lazily, but in neither case the getters actually return a nullptr.
This saves us a whole bunch of pointer dereferences (as NonnullGCPtr has
an `operator T&()`), and also has the interesting side effect of forcing
us to explicitly use the FunctionObject& overload of call(), as passing
a NonnullGCPtr is ambigous - it could implicitly be turned into a Value
_or_ a FunctionObject& (so we have to dereference manually).
Note that as of this commit, there aren't any such throwers, and the
call site in Heap::allocate will drop exceptions on the floor. This
commit only serves to change the declaration of the overrides, make sure
they return an empty value, and to propagate OOM errors frm their base
initialize invocations.
This makes construction of Utf16String fallible in OOM conditions. The
immediate impact is that PrimitiveString must then be fallible as well,
as it may either transcode UTF-8 to UTF-16, or create a UTF-16 string
from ropes.
There are a couple of places where it is very non-trivial to propagate
the error further. A FIXME has been added to those locations.
We return early from the DateTimeFormat constructor to avoid crashing on
assertions when the CLDR is disabled. However, after commit 019211b, the
spec now mandates we assert the time zone identifier is valid. The early
return resulted in this identifier being an empty string.
This will make it easier to support both string types at the same time
while we convert code, and tracking down remaining uses.
One big exception is Value::to_string() in LibJS, where the name is
dictated by the ToString AO.
We have a new, improved string type coming up in AK (OOM aware, no null
state), and while it's going to use UTF-8, the name UTF8String is a
mouthful - so let's free up the String name by renaming the existing
class.
Making the old one have an annoying name will hopefully also help with
quick adoption :^)
This is an editorial change to the ECMA-402 spec. See:
46aa5cc
Also add an ECMA-402 spec link to the DefaultTimeZone implementation, as
that definition supersedes ECMA-262.
In a subclass of Cell, we cannot use Cell::vm() before the base Cell
object itself is constructed. Use the Realm's VM instead.
This was caught by UBSAN with vptr sanitation enabled.
Intrinsics, i.e. mostly constructor and prototype objects, but also
things like empty and new object shape now live on a new heap-allocated
JS::Intrinsics object, thus completing the long journey of taking all
the magic away from the global object.
This represents the Realm's [[Intrinsics]] slot in the spec and matches
its existing [[GlobalObject]] / [[GlobalEnv]] slots in terms of
architecture.
In the majority of cases it should now be possibly to fully allocate a
regular object without the global object existing, and in fact that's
what we do now - the realm is allocated before the global object, and
the intrinsics between both :^)
This is needed so that the allocated NativeFunction receives the correct
realm, usually forwarded from the Object's initialize() function, rather
than using the current realm.