We currently have two base64 coders: one in AK, and another in LibWeb
providing a "forgiving" implementation. An upcoming ECMA-262 proposal
will require a third implementation.
Instead, let's use the base64 implementation that is used by Node.js and
recommended by the upcoming proposal. It handles forgiving decoding as
well.
Users of AK's implementation should be fine with the forgiving
behavior. The AK implementation originally had naive forgiving behavior,
but it was removed solely for performance reasons.
Using http://mattmahoney.net/dc/enwik8.zip (100MB unzipped) as a test,
performance of our old home-grown implementations vs. the simdutf
implementation (on Linux x64):
                    Encode    Decode
    AK base64       0.226s    0.169s
    LibWeb base64   N/A       1.244s
    simdutf         0.161s    0.047s
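For reference, the simdutf API being adopted looks roughly like the
following (a minimal sketch from memory; options and error handling are
abbreviated, see simdutf.h for the exact signatures):

    #include <simdutf.h>
    #include <string>
    #include <vector>

    std::string encode(std::vector<char> const& input)
    {
        std::string output(simdutf::base64_length_from_binary(input.size()), '\0');
        simdutf::binary_to_base64(input.data(), input.size(), output.data());
        return output;
    }

    std::vector<char> decode(std::string const& input)
    {
        // The maximal length is an upper bound; the returned result
        // reports the actual number of bytes written.
        std::vector<char> output(simdutf::maximal_binary_length_from_base64(input.data(), input.size()));
        auto result = simdutf::base64_to_binary(input.data(), input.size(), output.data());
        if (result.error != simdutf::SUCCESS)
            return {}; // Real code would propagate a decode error.
        output.resize(result.count);
        return output;
    }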
Supporting unbuffered fetches is actually part of the fetch spec in its
HTTP-network-fetch algorithm. We had previously implemented this method
in a very ad-hoc manner, as a simple wrapper around ResourceLoader. It
still wraps ResourceLoader, but we now implement a good number of the
spec's steps using ResourceLoader's unbuffered API. The response data is
forwarded through to the fetch response using streams.
This will eventually let us remove the use of ResourceLoader's buffered
API, as all responses should just be streamed this way. The streams spec
then supplies ways to wait for completion, thus allowing fully buffered
responses. However, we have more work to do to make the other parts of
our fetch implementation (namely, Body::fully_read) use streams before
we can do this.
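The shape of the data flow is roughly the following (a self-contained
sketch; load_unbuffered stands in for ResourceLoader's unbuffered API,
and Stream for the fetch response's body stream):

    #include <functional>
    #include <string>
    #include <vector>

    struct Stream {
        std::vector<std::string> chunks;
        bool closed { false };
        void enqueue(std::string chunk) { chunks.push_back(std::move(chunk)); }
        void close() { closed = true; }
    };

    // Stand-in for the unbuffered API: invokes on_data once per network
    // chunk, then on_complete.
    void load_unbuffered(std::function<void(std::string)> on_data,
                         std::function<void()> on_complete)
    {
        on_data("<chunk 1>");
        on_data("<chunk 2>");
        on_complete();
    }

    void fetch_into_stream(Stream& stream)
    {
        load_unbuffered(
            // Each chunk is forwarded straight to the response's stream...
            [&stream](std::string chunk) { stream.enqueue(std::move(chunk)); },
            // ...and completion closes it, which is what lets fully
            // buffered readers know the response is done.
            [&stream] { stream.close(); });
    }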
This callback is meant to be triggered by streams, which do not always
provide a WebIDL::DOMException. Pass a plain value instead. Of all the
users of this callback, only one actually uses the value, and already
converts the DOMException to a plain value.
Fetched bodies can be on the order of gigabytes, so rather than crashing
when we hit OOM here, we can simply invoke the error callback with a DOM
exception. We use "UnknownError" here as the spec directly supports this
for OOM errors:
    UnknownError: The operation failed for an unknown transient reason
    (e.g. out of memory).
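The pattern is roughly this (a sketch; error_callback is a stand-in,
and the WebIDL::UnknownError helper spelling is from memory):

    auto buffer_or_error = ByteBuffer::copy(bytes);
    if (buffer_or_error.is_error()) {
        // A gigabyte-sized body can plausibly fail to allocate; surface
        // that as a catchable DOMException instead of crashing.
        error_callback(WebIDL::UnknownError::create(realm, "Out of memory."_string));
        return;
    }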
This is still an ad-hoc implementation. We should be using streams, and
we do have the AOs available to do so. But they need to be massaged to
be compatible with callers of Body::fully_read. And once we do use
streams, this function will become infallible - so making it infallible
here is at least a step in the right direction.
The only subclass was already GC-allocated, so let's hoist the JS::Cell
inheritance up one level. This ends up simplifying some rather
dubious-looking code where we were previously slicing ESOs
(EnvironmentSettingsObjects).
Changes the signature of queue_fetch_task() from AK::Function to
JS::HeapFunction, making it clear to callers that this is what it uses
internally.
Changes the signature of queue_global_task() from AK::Function to
JS::HeapFunction, making it clear to callers that this is what it uses
internally.
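In sketch form, the signature change looks like this (parameter lists
abbreviated, and the exact GC pointer type may differ):

    // Before: nothing at the call site says the callable will be stored
    // on the GC heap.
    void queue_fetch_task(JS::Object& task_destination,
                          Function<void()> algorithm);

    // After: callers construct the GC-allocated function themselves,
    // e.g. via JS::create_heap_function(), so the storage is explicit.
    void queue_fetch_task(JS::Object& task_destination,
                          JS::NonnullGCPtr<JS::HeapFunction<void()>> algorithm);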
...and use HeapFunction instead of SafeFunction for task steps.
Since there is only one EventLoop per process, it lives as a global
handle in the VM custom data.
This makes it much easier to reason about lifetimes of tasks, task
steps, and random stuff captured by them.
We were off-by-one when returning the result of parsing a quoted string
in Web::Fetch::Infrastructure::collect_an_http_quoted_string. Instead of
backtracking the lexer and consuming the backtracked string, do a simple
substring operation.
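In sketch form (position_start and input are illustrative names for
state the AO already has):

    // Before (off-by-one prone): rewind the lexer, then re-consume.
    //     lexer.retreat(length);
    //     auto string = lexer.consume(length);

    // After: both endpoints are already known, so take the span directly.
    auto quoted_string = input.substring_view(position_start, lexer.tell() - position_start);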
This is a fetching AO and is only used by LibWeb in the context of fetch
tasks. Move it to LibWeb with other fetch methods.
The main reason for this is that it requires the use of other LibWeb AOs
such as the forgiving Base64 decoder and MIME sniffing. These AOs aren't
available within LibURL.
The HTMLMediaElement, for example, contains spec text which states any
ongoing fetch process must be "stopped". The spec does not indicate how
to do this, so our implementation is rather ad-hoc.
Our current implementation may cause a crash in places that assume one
of the fetch algorithms that we set to null is *not* null. For example:
    if (fetch_params.process_response) {
        queue_fetch_task([&fetch_params]() {
            fetch_params.process_response();
        });
    }
If the fetch process is stopped after queuing the fetch task, but not
before the fetch task is run, we will crash when running this fetch
algorithm.
We now track queued fetch tasks on the fetch controller. When the fetch
process is stopped, we cancel any such pending task.
It is a little bit awkward maintaining a fetch task ID. Ideally, we
could use the underlying task ID throughout. But we do not have access
to the underlying task nor its ID when the task is running, at which
point we need some ID to remove from the pending task list.
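The bookkeeping on the fetch controller looks roughly like this (a
sketch; member and type names are illustrative):

    class FetchController {
    public:
        // Hands out controller-local IDs, since the underlying
        // event-loop task ID is not visible from within a running task.
        u64 next_fetch_task_id() { return m_next_fetch_task_id++; }

        void fetch_task_queued(u64 fetch_task_id, u32 event_loop_task_id)
        {
            m_ongoing_fetch_tasks.set(fetch_task_id, event_loop_task_id);
        }

        void fetch_task_complete(u64 fetch_task_id)
        {
            m_ongoing_fetch_tasks.remove(fetch_task_id);
        }

        void stop_fetch()
        {
            // Cancel each still-pending task in the event loop by its
            // event-loop task ID (elided), then forget all of them.
            m_ongoing_fetch_tasks.clear();
        }

    private:
        u64 m_next_fetch_task_id { 1 };
        HashMap<u64, u32> m_ongoing_fetch_tasks;
    };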
This URL library ends up being a relatively fundamental base library of
the system, as LibCore depends on LibURL.
This change has two main benefits:
* Moving AK back more towards being an agnostic library that can
be used between the kernel and userspace. URL has never really fit
that description - and is not used in the kernel.
* URL _should_ depend on LibUnicode, as it needs punycode support.
However, it's not really possible to do this inside of AK as it can't
depend on any external library. This change brings us a little closer
to being able to do that, but unfortunately we aren't there quite
yet, as the code generators depend on LibCore.
We previously used an empty optional to denote that a ReferrerPolicy is
in the default empty string state. However, a later change added an
explicit EmptyString state. This patch moves all users to the explicit
state, and stops using `Optional<ReferrerPolicy>` everywhere except for
when an option not being passed from JavaScript has meaning.
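In sketch form (remaining enumerators elided):

    enum class ReferrerPolicy {
        EmptyString, // the spec's default "empty string" state
        NoReferrer,
        NoReferrerWhenDowngrade,
        // ...the other policies
    };

    // Optional<ReferrerPolicy> remains only where "not passed from
    // JavaScript at all" is itself meaningful.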
Along with putting functions in the URL namespace into a DOMURL
namespace.
This is done as LibWeb is in an awkward situation where it needs
two URL classes. AK::URL is the general-purpose URL class, which
is all that is needed in 95% of cases. URL in the Web namespace
is needed predominantly for interfacing with the JavaScript
interfaces.
Because there are two URL classes, the fully qualified AK::URL has
had to be used throughout LibWeb. If we move AK::URL into a URL
namespace, this becomes more painful, as ::URL::URL would be
required to name the constructor (and something like
::URL::create_with_url_or_path in other places).
To fix this problem, rename the class in LibWeb implementing the
URL IDL interface to DOMURL, and move the other Web URL-related
classes into this DOMURL folder.
One could argue that this name also makes it a little clearer why
these two URL classes need to be used in LibWeb in the first place.
Just creating a stream on the JS heap isn't enough, as we will later
crash when trying to read from that stream as it hasn't been properly
initialized. Instead, until we have teeing implemented (which is a
rather huge part of the Streams spec), create streams using proper AOs
that do initialize the stream.
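In sketch form (the allocation and AO spellings here are approximate;
CreateReadableStream is the relevant spec AO):

    // Not enough: the object exists on the GC heap, but it has no
    // controller or queue, so the first read will crash.
    //     auto stream = realm.heap().allocate<Streams::ReadableStream>(realm, realm);

    // Instead, let the AO set up the controller and internal state:
    //     auto stream = TRY(Streams::create_readable_stream(realm, ...));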