Commit graph

119 commits

Author SHA1 Message Date
Andreas Kling
88e7688940 LibWeb: Make more MimeSniff::MimeType APIs infallible 2024-10-14 20:47:35 +02:00
rmg-x
9dd7b901ab LibWeb/ResourceLoader: Display HTTP reason phrase in error status 2024-10-11 07:43:23 +01:00
rmg-x
ff18114ae7 LibWeb+LibRequests+RequestServer: Report network error on request finish
This allows us to bubble up network errors to API consumers after
finishing a request.
2024-10-10 19:56:11 +01:00
Timothy Flynn
00487a7b25 LibWebView+LibWeb: Remove ResourceLoader and WebSocket adapters
These were used to provide a layer of abstraction between ResourceLoader
and the networking backend. Now that we only have RequestServer, we can
remove these adapters to make the code a bit easier to follow.
2024-10-08 06:52:51 +02:00
Andreas Kling
cc4b3cbacc Meta: Update my e-mail address everywhere 2024-10-04 13:19:50 +02:00
Andreas Kling
32299e74cb LibWeb: Make CSS font loader tolerate WPT web server shenanigans
The web server for WPT has a tendency to just disconnect after sending
us a resource. This makes curl think an error occurred, but it's
actually still recoverable and we have the data.

So instead of just bailing, do what we already do for other kinds of
resources and try to parse the data we got. If it works out, great!

It would be nice to solve this in the networking layer instead, but
I'll leave that as an exercise for our future selves.
2024-09-21 19:20:30 +02:00
Timothy Flynn
e74d2b1762 LibWeb+LibWebView: Set the default path for invalid cookie Path values
We were missing this spec step when parsing the Path attribute.
2024-09-19 00:01:56 +01:00
Shannon Booth
cc55732332 LibURL+Everywhere: Only percent decode URL paths when actually needed
Web specs do not return through javascript percent decoded URL path
components - but we were doing this in a number of places due to the
default behaviour of URL::serialize_path.

Since percent encoded URL paths may not contain valid UTF-8 - this was
resulting in us crashing in these places.

For example - on an HTMLAnchorElement when retrieving the pathname for
the URL of:

http://ladybird.org/foo%C2%91%91

To fix this make the URL class only return the percent encoded
serialized path, matching the URL spec. When the decoded path is
required instead explicitly call URL::percent_decode.

This fixes a crash running WPT URL tests for the anchor element on:

https://wpt.live/url/a-element.html
2024-08-05 09:58:13 +02:00
Jamie Mansfield
2ca8fd1832 LibWeb: Make preferred languages configurable
This also changes fetch to use the preferred languages for the
Accept-Language header.
2024-07-25 11:38:59 +01:00
Andreas Kling
2c918b540d LibWeb+RequestServer: Let RequestServer set HTTP Accept-Encoding header
Ultimately it's RequestServer who knows which kind of encodings it can
handle and decompress, so let's have it set the Accept-Encoding.
2024-07-16 11:01:01 +02:00
circl
91e3ef6dbf LibWeb/ResourceLoader: Report file: errors as "network errors"
This triggers the generated error page which is more informative.
2024-07-05 15:08:13 -06:00
circl
e35b055192 LibWeb/ResourceLoader: Call error callback if resource: load fails 2024-07-05 15:08:13 -06:00
Tim Ledbetter
66662496e6 LibWeb: Remove ResourceLoader::emit_signpost()
This was only ever used within SerenityOS. The current implementation
of this method did nothing.
2024-07-05 07:15:44 +02:00
Jamie Mansfield
1128375dff LibWeb: Support Gecko and WebKit navigator compatibility modes
This will log a debug message when calls are made to the
NavigatorIDMixin that make decisions based on the compatibility mode.
2024-07-05 07:14:03 +02:00
rmg-x
ae983a2ef7 LibWeb: Add log_filtered_request() method in ResourceLoader
This will reduce log noise when visiting sites
that have a lot of filtered content.

Previously, red error text would be displayed in
the logs for each filtered URL.
2024-07-04 21:45:30 +01:00
Andreas Kling
847243b706 LibWeb: Only inject "User-Agent"/"Accept" headers when they're missing
...otherwise we send out HTTP requests with duplicates of these headers.
2024-06-19 08:09:10 +02:00
circl
c169e43e13 Userland: Remove some SerenityOS checks 2024-06-10 13:53:01 +02:00
Andreas Kling
260c5c50ad LibHTTP+RequestServer: Use HTTP::HeaderMap for request headers
No longer just for response headers! The same type is obviously useful
and ergonomic when making requests as well.
2024-06-09 15:34:02 +02:00
Andreas Kling
5ac0938859 LibHTTP+LibWeb: Stop bundling "Set-Cookie" headers as JSON
Before we had HTTP::HeaderMap (which preserves multiple headers with the
same name), we collected multiple "Set-Cookie" headers and bundled them
together as a JSON array.

This was a huge hack, and now we can stop doing that, since LibWeb gets
access to the full set of headers now.
2024-06-09 15:34:02 +02:00
Andreas Kling
e636851481 LibHTTP+RequestServer: Add HTTP::HeaderMap and use for response headers
Instead of using a HashMap<ByteString, ByteString, CaseInsensitive...>
everywhere, we now encapsulate this in a class.

Even better, the new class also allows keeping track of multiple headers
with the same name! This will make it possible for HTTP responses to
actually retain all their headers on the perilous journey from
RequestServer to LibWeb.
2024-06-09 15:34:02 +02:00
Andreas Kling
f2fd8fc928 Everywhere: Remove LibGemini
This hasn't been maintained (or worked at all) for a long time,
and it's not a widely supported protocol, so let's drop it.
2024-06-04 09:19:39 +02:00
Jamie Mansfield
295c4ef51a LibWeb/Fetch: Use MimeType in DataURL 2024-06-02 19:55:53 +02:00
Timothy Flynn
1e97ae66e5 LibWeb: Support unbuffered resource load requests
This adds an alternate API to ResourceLoader to load HTTP/HTTPS/Gemini
requests unbuffered. Most of the changes here are moving parts of the
existing ResourceLoader::load method to helper methods so they can be
re-used by the new ResourceLoader::load_unbuffered.
2024-05-26 18:29:24 +02:00
Timothy Flynn
168d28c15f LibProtocol+Userland: Support unbuffered protocol requests
LibWeb will need to use unbuffered requests to support server-sent
events. Connection for such events remain open and the remote end sends
data as HTTP bodies at its leisure. The browser needs to be able to
handle this data as it arrives, as the request essentially never
finishes.

To support this, this make Protocol::Request operate in one of two
modes: buffered or unbuffered. The existing mechanism for setting up a
buffered request was a bit awkward; you had to set specific callbacks,
but be sure not to set some others, and then set a flag. The new
mechanism is to set the mode and the callbacks that the mode needs in
one API.
2024-05-26 18:29:24 +02:00
Timothy Flynn
086ddd481d Ladybird+LibWeb: Move User-Agent definitions to their own file
This is to avoid including any LibProtocol header in Objective-C source
files, which will cause a conflict between the Protocol namespace and a
@Protocol interface.

See Ladybird/AppKit/Application/ApplicationBridge.cpp for why this
conflict unfortunately cannot be worked around.
2024-05-26 18:29:24 +02:00
Shannon Booth
d200704b6d LibWeb: Add Last-Modified header to file and resource requests 2024-04-02 07:51:02 +02:00
Andreas Kling
3d78f86a2f LibWeb: Don't ask RequestServer to prefetch DNS for file: and data: URLs 2024-03-26 11:39:45 +01:00
Timothy Flynn
24ecf31ff5 LibURL+LibWeb: Move data URL processing to LibWeb's fetch infrastructure
This is a fetching AO and is only used by LibWeb in the context of fetch
tasks. Move it to LibWeb with other fetch methods.

The main reason for this is that it requires the use of other LibWeb AOs
such as the forgiving Base64 decoder and MIME sniffing. These AOs aren't
available within LibURL.
2024-03-25 08:13:27 +01:00
Shannon Booth
e800605ad3 AK+LibURL: Move AK::URL into a new URL library
This URL library ends up being a relatively fundamental base library of
the system, as LibCore depends on LibURL.

This change has two main benefits:
 * Moving AK back more towards being an agnostic library that can
   be used between the kernel and userspace. URL has never really fit
   that description - and is not used in the kernel.
 * URL _should_ depend on LibUnicode, as it needs punnycode support.
   However, it's not really possible to do this inside of AK as it can't
   depend on any external library. This change brings us a little closer
   to being able to do that, but unfortunately we aren't there quite
   yet, as the code generators depend on LibCore.
2024-03-18 14:06:28 -04:00
Shannon Booth
9ce8189f21 Everywhere: Use unqualified AK::URL
Now possible in LibWeb now that there is no longer a Web::URL.
2024-02-25 08:54:31 +01:00
Andrew Kaster
e7daa02bf2 LibWeb: Unblock port 9000
This was blocked because it can be used for cross-protocol attacks on
some network printers. However, it's also used by the web platform
tests. One can argue that getting WPT working is more important than
theoretical attacks on poorly configured printers.
2024-02-12 11:43:22 -07:00
Bastiaan van der Plaat
cde14901bc Ladybird+LibWeb: Add initial about:version internal page 2024-01-13 13:41:09 -05:00
Bastiaan van der Plaat
05c0640474 Ladybird+LibWeb: Add about scheme support for internal pages 2024-01-13 13:41:09 -05:00
Andreas Kling
bf5ad56085 LibWeb: Ignore preconnect requests for file: and data: URLs
I noticed while debugging a fully downloaded page that it was trying
to preconnect to a file:// host. That doesn't make any sense, so let's
add a tiny bit of logic to ignore preconnect requests for file: and
data: URLs.
2023-12-30 13:49:50 +01:00
Bastiaan van der Plaat
f8feca5d21 LibWeb: Use directory page when viewing a resource schemed directory URL 2023-12-27 10:54:07 -05:00
Timothy Flynn
e511a264fe LibWeb: Implement ad-hoc steps to allow LibWeb to load resource:// URLs
The resource:// scheme is used for Core::Resource files. Currently, any
users of resource:// URLs in Ladybird must manually create the Resource
and extract its data. This will allow for passing the resource:// URL
along for LibWeb to handle.
2023-12-24 14:09:23 +01:00
Ali Mohammad Pur
5e1499d104 Everywhere: Rename {Deprecated => Byte}String
This commit un-deprecates DeprecatedString, and repurposes it as a byte
string.
As the null state has already been removed, there are no other
particularly hairy blockers in repurposing this type as a byte string
(what it _really_ is).

This commit is auto-generated:
  $ xs=$(ack -l \bDeprecatedString\b\|deprecated_string AK Userland \
    Meta Ports Ladybird Tests Kernel)
  $ perl -pie 's/\bDeprecatedString\b/ByteString/g;
    s/deprecated_string/byte_string/g' $xs
  $ clang-format --style=file -i \
    $(git diff --name-only | grep \.cpp\|\.h)
  $ gn format $(git ls-files '*.gn' '*.gni')
2023-12-17 18:25:10 +03:30
Aliaksandr Kalenik
6ac43274b2 LibWeb+LibJS: Use JS::GCPtr for pointers to GC-allocated objects
Fixes warnings found by LibJSGCVerifier
2023-12-11 16:55:25 +01:00
Andrew Kaster
04670c7a06 LibWeb: Hide load started/completed debug messages behind SPAM_DEBUG 2023-12-08 20:04:13 -05:00
Shannon Booth
6aff55d655 LibWeb: Port NavigatorID from DeprecatedString to String 2023-11-20 15:00:19 +01:00
Andrew Kaster
ea95256f83 LibWeb: Remove unused ResourceLoader::load(URL) overload
With the refactoring of Workers, nobody is calling this LoadRequest-less
overload. All new code should eventually be moved to fetch anyway.
2023-11-15 12:56:33 +01:00
Andrew Kaster
d7d84ee931 LibWeb: Ensure a Web::Page is associated with local Worker LoadRequests
This is a hack on top of a hack because Workers don't *really* need to
have a Web::Page at all, but the ResourceLoader infra that should be
going away soon ™️ is not quite ready to axe that requirement for
cookies.
2023-11-15 12:56:33 +01:00
Aliaksandr Kalenik
b9e0ad4358 LibWeb: Make ResourceLoader pass body and headers in error callback
Pass body and headers of a failed request to callback so caller can
process them.
2023-10-03 09:41:56 +02:00
Bastiaan van der Plaat
8f2319e966 Ladybird+LibWeb: Rename FileDirectoryLoader to GeneratedPagesLoader 2023-09-24 19:59:00 -06:00
Bastiaan van der Plaat
eafdb06d87 LibWeb: Add directory entries page when visiting a local directory 2023-08-15 10:41:54 +01:00
auipc
1653c5ea41 LibWeb: Use current platform for navigator.platform
Before, navigator.platform would always report the platform as "Serenity
OS", regardless of whether or not that was true. It also did not include
the architecture, which Firefox and Chrome both do. Now, it can report
either "Linux x86_64" or "SerenityOS AArch64".
2023-08-13 05:13:18 +02:00
Karol Kosek
eb41f0144b AK: Decode data URLs to separate class (and parse like every other URL)
Parsing 'data:' URLs took it's own route. It never set standard URL
fields like path, query or fragment (except for scheme) and instead
gave us separate methods called `data_payload()`, `data_mime_type()`,
and `data_payload_is_base64()`.

Because parsing 'data:' didn't use standard fields, running the
following JS code:

    new URL('#a', 'data:text/plain,hello').toString()

not only cleared the path as URLParser doesn't check for data from
data_payload() function (making the result be 'data:#a'), but it also
crashes the program because we forbid having an empty MIME type when we
serialize to string.

With this change, 'data:' URLs will be parsed like every other URLs.
To decode the 'data:' URL contents, one needs to call process_data_url()
on a URL, which will return a struct containing MIME type with already
decoded data! :^)
2023-08-01 14:19:05 +02:00
Karol Kosek
f27b9b9563 LibWeb: Set Content-Type for data: URLs instead of checking MIME on load
This makes the loader more agnostic.

Additionally, this allows us to load tab in Ladybird with a 'data:' URL
containing parameters, as a Resource will now call
`mime_type_from_content_type` to extract the content type from MIME. :^)
2023-08-01 14:19:05 +02:00
Daniel
d8c1150f6b LibWeb: Respect "no-store" directive in cache-control header 2023-06-22 21:24:23 +02:00
MacDue
35612c6a7f AK+Everywhere: Change URL::path() to serialize_path()
This now defaults to serializing the path with percent decoded segments
(which is what all callers expect), but has an option not to. This fixes
`file://` URLs with spaces in their paths.

The name has been changed to serialize_path() path to make it more clear
that this method will generate a new string each call (except for the
cannot_be_a_base_url() case). A few callers have then been updated to
avoid repeatedly calling this function.
2023-04-15 06:37:04 +02:00