Fetched bodies can be on the order of gigabytes, so rather than crashing
when we hit OOM here, we can simply invoke the error callback with a DOM
exception. We use "UnknownError" here as the spec directly supports this
for OOM errors:
UnknownError: The operation failed for an unknown transient reason
(e.g. out of memory).
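In sketch form, the idea is something like this (try_copy_bytes() and the
exact callback/exception helper spellings are placeholders, not the final
API):

// Inside Body::fully_read(), instead of returning ErrorOr<> and crashing
// the process on failure:
auto bytes_or_error = try_copy_bytes(); // copying a multi-gigabyte body may genuinely fail
if (bytes_or_error.is_error()) {
    // Surface the OOM to the caller as an "UnknownError" DOMException.
    process_body_error(WebIDL::UnknownError::create(realm, "Out of memory"_fly_string));
    return;
}
process_body(bytes_or_error.release_value());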
This is still an ad-hoc implementation. We should be using streams, and
we do have the AOs available to do so. But they need to be massaged to
be compatible with callers of Body::fully_read. And once we do use
streams, this function will become infallible - so making it infallible
here is at least a step in the right direction.
Changes the signature of queue_fetch_task() from AK::Function to
JS::HeapFunction, to make it clearer to users of the function that this
is what it uses internally.
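Roughly, the declaration changes along these lines (parameters other than
the callback are abbreviated and illustrative):

// Before: the caller passes a plain AK::Function, which we wrapped internally.
void queue_fetch_task(JS::Object& task_destination, Function<void()> algorithm);

// After: the JS::HeapFunction is part of the interface.
void queue_fetch_task(JS::Object& task_destination, JS::NonnullGCPtr<JS::HeapFunction<void()>> algorithm);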
This is a fetching AO and is only used by LibWeb in the context of fetch
tasks. Move it to LibWeb with other fetch methods.
The main reason for this is that it requires the use of other LibWeb AOs
such as the forgiving Base64 decoder and MIME sniffing. These AOs aren't
available within LibURL.
The HTMLMediaElement, for example, contains spec text which states any
ongoing fetch process must be "stopped". The spec does not indicate how
to do this, so our implementation is rather ad-hoc.
Our current implementation may cause a crash in places that assume one
of the fetch algorithms that we set to null is *not* null. For example:
if (fetch_params.process_response) {
    queue_fetch_task([&fetch_params]() {
        fetch_params.process_response();
    });
}
If the fetch process is stopped after queuing the fetch task, but not
before the fetch task is run, we will crash when running this fetch
algorithm.
We now track queued fetch tasks on the fetch controller. When the fetch
process is stopped, we cancel any such pending task.
It is a little awkward to maintain a separate fetch task ID. Ideally, we
could use the underlying task ID throughout. But we do not have access
to the underlying task or its ID while the task is running, which is
exactly when we need an ID to remove from the pending-task list.
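In sketch form, the bookkeeping on the fetch controller looks something
like this (member names and the cancel_queued_task() helper are
illustrative placeholders):

class FetchController {
public:
    u64 next_fetch_task_id() { return m_next_fetch_task_id++; }

    void fetch_task_queued(u64 fetch_task_id, int event_loop_task_id)
    {
        m_ongoing_fetch_tasks.set(fetch_task_id, event_loop_task_id);
    }

    void fetch_task_complete(u64 fetch_task_id)
    {
        m_ongoing_fetch_tasks.remove(fetch_task_id);
    }

    void stop_fetch()
    {
        // ...null out the fetch algorithms as before, then also remove any
        // fetch task that was queued but has not run yet, so a nulled-out
        // algorithm is never invoked.
        for (auto const& it : m_ongoing_fetch_tasks)
            cancel_queued_task(it.value); // hypothetical helper
        m_ongoing_fetch_tasks.clear();
    }

private:
    u64 m_next_fetch_task_id { 1 };
    HashMap<u64, int> m_ongoing_fetch_tasks;
};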
This URL library ends up being a relatively fundamental base library of
the system, as LibCore depends on LibURL.
This change has two main benefits:
* It moves AK back towards being an agnostic library that can
be used by both the kernel and userspace. URL has never really fit
that description - and is not used in the kernel.
* URL _should_ depend on LibUnicode, as it needs punycode support.
However, it's not really possible to do this inside of AK as it can't
depend on any external library. This change brings us a little closer
to being able to do that, but unfortunately we aren't there quite
yet, as the code generators depend on LibCore.
We previously used an empty optional to denote that a ReferrerPolicy is
in the default empty-string state. However, an explicit EmptyString
state was added later. This patch moves all users to the explicit
state, and stops using `Optional<ReferrerPolicy>` everywhere except
where an option not being passed from JavaScript is itself meaningful.
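In sketch form (only EmptyString comes from this change; the other
enumerators and signatures are placeholders):

enum class ReferrerPolicy {
    EmptyString, // the default "" policy, now an explicit state
    NoReferrer,
    SameOrigin,
    // ...
};

// Most code now takes the enum directly:
void set_referrer_policy(ReferrerPolicy);

// Optional<> remains only where "the option was not passed from JS" is
// itself meaningful:
Optional<ReferrerPolicy> referrer_policy; // e.g. a RequestInit-style member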
Along with putting functions in the URL namespace into a DOMURL
namespace.
This is done as LibWeb is in an awkward situation where it needs
two URL classes. AK::URL is the general-purpose URL class, which
is all that is needed in 95% of cases. The URL class in the Web
namespace is needed predominantly for interfacing with JavaScript
via the URL IDL interface.
Because there are two classes named URL, the fully qualified
AK::URL has had to be used throughout LibWeb. If we move AK::URL
into a URL namespace, this becomes even more painful: ::URL::URL
is required to name the constructor (and something like
::URL::create_with_url_or_path in other places).
To fix this problem, rename the class in LibWeb implementing the
URL IDL interface to DOMURL, and move the other Web URL-related
classes into this DOMURL folder.
One could argue that this name also makes it a little clearer in
LibWeb why these two URL classes need to exist in the first place.
The resource:// scheme is used for Core::Resource files. Currently, any
users of resource:// URLs in Ladybird must manually create the Resource
and extract its data. This will allow for passing the resource:// URL
along for LibWeb to handle.
This commit un-deprecates DeprecatedString, and repurposes it as a byte
string.
As the null state has already been removed, there are no other
particularly hairy blockers in repurposing this type as a byte string
(what it _really_ is).
This commit is auto-generated:
$ xs=$(ack -l \bDeprecatedString\b\|deprecated_string AK Userland \
Meta Ports Ladybird Tests Kernel)
$ perl -pie 's/\bDeprecatedString\b/ByteString/g;
s/deprecated_string/byte_string/g' $xs
$ clang-format --style=file -i \
$(git diff --name-only | grep \.cpp\|\.h)
$ gn format $(git ls-files '*.gn' '*.gni')
With this change, we now have ~1200 CellAllocators across both LibJS and
LibWeb in a normal WebContent instance.
This gives us a minimum heap size of 4.7 MiB in the scenario where we
only have one cell allocated per type. Of course, in practice there will
be many more cells of each type, so the effective overhead is quite a
bit smaller than that.
I left a few types unconverted to this mechanism because I got tired of
doing this. :^)
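For reference, converting a type looks roughly like this (the macro
names here reflect the per-type CellAllocator mechanism; treat the exact
spelling as illustrative):

// MyObject.h
class MyObject final : public JS::Object {
    JS_OBJECT(MyObject, JS::Object);
    JS_DECLARE_ALLOCATOR(MyObject); // declares a per-type CellAllocator

public:
    // ...
};

// MyObject.cpp
JS_DEFINE_ALLOCATOR(MyObject);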
This is a hack on top of a hack because Workers don't *really* need to
have a Web::Page at all, but the ResourceLoader infra that should be
going away soon ™️ is not quite ready to axe that requirement for
cookies.
In FetchAlgorithms, it is common for callbacks to capture realms. This
can indirectly keep alive the objects that hold the FetchController
which owns these callbacks, creating a reference cycle. However, when
JS::HeapFunction is used, this is not a problem, as values captured by
callbacks do not create new roots.
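A sketch of the difference (fetch_algorithms, request and the exact
callback types are used loosely here):

// Before (illustrative): values captured by the callback are effectively
// rooted for as long as the callback exists. The FetchController holds the
// callbacks, and the captured objects (transitively) hold the
// FetchController, so nothing in the cycle is ever collected.
fetch_algorithms.process_response = [request](auto response) { /* ... */ };

// After: a JS::HeapFunction only traces its captures through the GC heap
// instead of rooting them, so the whole cycle becomes collectable.
auto process_response = JS::create_heap_function(vm.heap(),
    [request](JS::NonnullGCPtr<Infrastructure::Response> response) {
        // ...
    });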
Stop worrying about tiny OOMs. Work towards #20449.
While going through these, I also changed function signatures in the
many places where returning ThrowCompletionOr<T> is no longer necessary.
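A typical conversion looks something like this (make_label() is a
made-up name for illustration):

// Before: fallible only because a small string allocation could
// theoretically OOM, so every call site wraps it in TRY().
static ThrowCompletionOr<String> make_label(VM& vm);

// After: tiny allocations are treated as infallible, so the wrapper and
// the TRY()s at the call sites disappear.
static String make_label(VM& vm);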
This adds a simple and incomplete implementation for extracting some
specific CORS headers that are used by fetch. It unifies the ad-hoc
parsing that already existed for Access-Control-Allow-Headers
and Access-Control-Allow-Methods, and adds
Access-Control-Expose-Headers.
This adds the headers named in Access-Control-Expose-Headers to the
response's CORS-exposed header-name list which allows those headers to
be accessed from JS.
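In rough terms (the helper and member names here are illustrative, not
the exact LibWeb API):

// "Access-Control-Expose-Headers: X-Foo, X-Bar" -> ["X-Foo", "X-Bar"]
auto exposed_header_names = extract_header_list_values("Access-Control-Expose-Headers"sv, response.header_list());

// Append them to the response's CORS-exposed header-name list so that JS
// can later read those headers from the response.
for (auto const& name : exposed_header_names)
    response.add_cors_exposed_header_name(name);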
Parsing 'data:' URLs took its own route. It never set standard URL
fields like path, query or fragment (except for scheme) and instead
gave us separate methods called `data_payload()`, `data_mime_type()`,
and `data_payload_is_base64()`.
Because parsing 'data:' didn't use standard fields, running the
following JS code:
new URL('#a', 'data:text/plain,hello').toString()
not only cleared the path, because URLParser doesn't consult the
data_payload() function (making the result 'data:#a'), but also crashed
the program, because we forbid an empty MIME type when serializing to a
string.
With this change, 'data:' URLs are parsed like every other URL.
To decode the contents of a 'data:' URL, one calls process_data_url()
on the URL, which returns a struct containing the MIME type and the
already-decoded data! :^)
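Usage then looks roughly like this (the exact shape and field names of
the returned struct, and whether it comes wrapped in ErrorOr, are
assumptions):

URL url("data:text/plain;base64,aGVsbG8=");
auto data_url = TRY(url.process_data_url());
// data_url.mime_type -> "text/plain"
// data_url.body      -> "hello" (already base64-decoded)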
In order to follow the spec text to achieve this, we need to change the
underlying representation of a host in AK::URL to the deserialized
format. Before this, we were parsing the host and then immediately
serializing it again.
Making that change resulted in a whole bunch of fallout.
After this change, callers can access the serialized host through the
spec's host-serializer concept. The functional end result of this
change is that IPv6 hosts are now correctly serialized to be
surrounded with '[' and ']'.
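As a rough illustration (the accessor name is approximate; it wraps the
host serializer):

URL url("http://[::1]:8000/");
// The host is now stored in its deserialized form (an IPv6 address,
// i.e. eight 16-bit pieces), not as a pre-serialized string.
url.serialized_host(); // "[::1]" - the host serializer adds the brackets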