The underlying storage used during string formatting is StringBuilder.
To support UTF-16 strings, this patch allows callers to specify a mode
during StringBuilder construction. The default mode is UTF-8, for which
StringBuilder remains unchanged.
In UTF-16 mode, we treat the StringBuilder's internal ByteBuffer as a
series of u16 code units. Appending a single character will append 2
bytes for that character (cast to a char16_t). Appending a StringView
will transcode the string to UTF-16.
Utf16String also gains the same memory optimization that we added for
String, where we hand-off the underlying buffer to Utf16String to avoid
having to re-allocate.
In the future, we may want to further optimize for ASCII strings. For
example, we could defer committing to the u16-esque storage until we
see a non-ASCII code point.
This is a strictly UTF-16 string with some optimizations for ASCII.
* If created from a short UTF-8 or UTF-16 string that is also ASCII,
then the string is stored in an inlined byte buffer.
* If created with a long UTF-8 or UTF-16 string that is also ASCII,
then the string is stored in an outlined char buffer.
* If created with a short or long UTF-8 or UTF-16 string that is not
ASCII, then the string is stored in an outlined char16 buffer.
We do not store short non-ASCII text in the inlined buffer to avoid
confusion with operations such as `length_in_code_units` and
`code_unit_at`. For example, "😀" would be stored as 4 UTF-8 bytes
in short string form. But we still want `length_in_code_units` to
be 2, and `code_unit_at(0)` to be 0xD83D.
We now do the proper thing in terms of:
- Allowing percentages
- Returning the computed value in getComputedStyle
- Handling values out of the [0,1] range
Gains us 13 WPT passes in the imported tests.
Most locations already make this assumption, but we had a few that would
silently ignore data mismatches. Let's just always assume the type is
correct for now. If a bad actor has a hold of our transport socket, it's
probably better to crash WebContent than to continue on with incorrect
data.
In the future, maybe we will want to throw an exception instead.
Now that these serializers are internal to StructuredSerialize.cpp,
let's put them above Serializer so they don't have to be forward-
declared and explicitly instantiated.
Our structured serialization implementation had its own bespoke encoder
and decoder to serialize JS values. It also used a u32 buffer under the
hood, which made using its structures a bit awkward. We had previously
worked around its data structures in transferable streams, which nested
transfers of MessagePort instances. We basically had to add hooks into
the MessagePort to route to the correct transfer receiving steps, and
we could not invoke the correct AOs directly as the spec dictates.
We now use IPC mechanics to encode and decode data. This works because,
although we are encoding JS values, we are only ultimately encoding
primitive and basic AK types. The resulting data structures actually
enforce that we implement transferable streams exactly as the spec is
worded (I had planned to do that in a separate commit, but the fallout
of this patch actually required that change).
No need to manually prepare / clean up a context. We also previously
would not have done the clean up steps if structured deserialization
threw an exception.
Instead of every branch being of the form:
if (value.is_object() && is<SomeType>(value.as_object()) {
auto& some_type = static_cast<SomeType&>(value.as_object());
}
Let's extract the `is_object` check to an outer branch, and use `as_if`
to check the type.
No functional change, but this makes a future change simpler to review.
Add global registry for registered properties and partial support
for `@property` rule. Enables registering properties with initial
values. Also adds basic retrieval via `var()`.
Note: This is not a complete `@property` implementation.
When dumping the DOM tree, only prefix the element's local name with its
short namespace identifier if it's not the document's default namespace.
This gets rid of the excessive "html:" and "svg:" prefixes in context of
an HTML or SVG document, respectively.
More things need this than just the `<path>` element, so let's avoid
having to include `SVGPathElement.h` in places that don't need it.
Minor changes at the same time:
- Wrap it in a Path class
- Specify underlying type for PathInstructionType
- Make a couple of free functions into methods
- Give PathInstruction an operator==
No functionality changes.
WPT reference tests can add metadata to tests to instruct the test
runner how to interpret the results. Because of this, it is not enough
to have an action that starts loading the (mis)match reference: we need
the test runner to receive the metadata so it can act accordingly.
This sets our test runner up for potentially supporting multiple
(mis)match references, and fuzzy rendering matches - the latter will be
implemented in the following commit.
While 788d5368a7 took care of better text
marker positioning, this improves graphical marker positioning instead.
By looking at how Firefox and Chrome render markers, it's clear that
there are three parts to positioning a graphical marker:
* The containing space that the marker resides in;
* The marker dimensions;
* The distance between the marker and the start of the list item.
The space that the marker can be contained in, is the area to the left
of the list item with a height of the marker's line-height. The marker
dimensions are relative to the marker's font's pixel size: most of them
are a square at 35% of the font size, but the disclosure markers are
sized at 50% instead. Finally, the marker distance is always gauged at
50% of the font size.
So for example, a list item with `list-style-type: disc` and `font-size:
20px`, has 10px between its start and the right side of the marker, and
the marker's dimensions are 7x7.
The percentages I've chosen closely resemble how Firefox lays out its
list item markers.
This change converts `Node::invalidate_style()` (invalidation sets
overload) from eagerly doing tree traversal that marks elements affected
by invalidation set to instead adding "pending invalidation sets" into
`StyleInvalidator`, processing of which is deferred until the next
`update_style()`. By doing that we sometimes substantially reduce amount
of work done performing tree traversal that marks elements for style
recalculation.
Improves performance on Discord, were according to my measurements we
were previously spending 20% of time in style invalidation, but now it's
down to <1%.
This change introduces StyleInvalidator as a preparation for upcoming
change that will make `perform_pending_style_invalidations()` take care
of pending invalidation sets.
Note that it's not actually executing tasks in parallel, it's still
throwing them on the HTML event loop task queue, each with its own
unique task source.
This makes our fetch implementation a lot more robust when HTTP caching
is enabled, and you can now click links on https://terminal.shop/
without hitting TODO assertions in fetch.
`<syntax>` is a limited subset of the "value definition syntax" used in
CSS specs. It's used for `@property`'s `syntax` descriptor, and for the
`type()` function in `attr()`.
UnresolvedStyleValue::create() has one user where we know if there are
any arbitrary substitution functions in the list of CVs, and two users
where we don't know and just hope there aren't any. I'm about to add
another user that also doesn't know, and so it seems worth just making
UnresolvedStyleValue::create() do that work instead.
We keep the parameter, now Optional<>, so that we save some redundant
work in that one place where we do already know.
A couple of arbitrary substitution functions require us to get or
produce some style value, and then substitute its ComponentValues into
the original ComponentValue list. So this commit gives CSSStyleValue a
tokenize() method that does so.
Apart from a couple of unusual cases like the guaranteed-invalid value,
style values can all be converted into ComponentValues by serializing
them as a string, and then parsing that as a list of component values.
That feels unnecessarily inefficient in most cases though, so I've
implemented faster overrides for a lot of the basic style value
classes, but left that serialize-and-reparse method as the fallback.
`CSSColorValue`s which have unresolved `calc` components should be able
to be resolved. Previously we would always resolve them but with
incorrect values.
This is useful as we will now be able to now whether we should serialize
colors in their normalized form or not.
Slight regression in that we now serialize (RGB, HSL and HWB) colors
with components that rely on compute-time information as an empty
string, but that will be fixed in the next commit.
These new methods are built on top of the spec's
`simplify_a_calculation_tree` algorithm where the old methods were
ad-hoc.
These methods are not used anywhere yet as callers will need to be
migrated over from the deprecated methods one-by-one to account for
differences in behaviour.
No functionality changes.
The existing resolve methods are not to spec and we are working to
replace them with new ones based on the `simplify_a_calculation_tree`
method.
These are marked as deprecated rather than replaced outright as work
will need to be done on the caller side to be made compatible with the
new methods, for instance the new methods can fail to resolve (e.g.
if we are missing required context), where the existing methods will
always resolve (albeit sometimes with an incorrect value).
No functionality changes.
When setting a declaration for a property in a logical property group,
it should appear after all other declarations which belong to the same
property group but have different mapping logic (are/aren't a logical
alias).
Gains us 1 WPT pass.
We should not serialize a group of properties `longhands` as a single
shorthand if there is any property declared between the first and
last property in `longhands` which is not part of `longhands` but
belongs to the same logical property group, and has different mapping
logic to any of property in `longhands`