Commit graph

108 commits

Author SHA1 Message Date
circl
e35b055192 LibWeb/ResourceLoader: Call error callback if resource: load fails 2024-07-05 15:08:13 -06:00
Tim Ledbetter
66662496e6 LibWeb: Remove ResourceLoader::emit_signpost()
This was only ever used within SerenityOS. The current implementation
of this method did nothing.
2024-07-05 07:15:44 +02:00
Jamie Mansfield
1128375dff LibWeb: Support Gecko and WebKit navigator compatibility modes
This will log a debug message when calls are made to the
NavigatorIDMixin that make decisions based on the compatibility mode.
2024-07-05 07:14:03 +02:00
rmg-x
ae983a2ef7 LibWeb: Add log_filtered_request() method in ResourceLoader
This will reduce log noise when visiting sites
that have a lot of filtered content.

Previously, red error text would be displayed in
the logs for each filtered URL.
2024-07-04 21:45:30 +01:00
Andreas Kling
847243b706 LibWeb: Only inject "User-Agent"/"Accept" headers when they're missing
...otherwise we send out HTTP requests with duplicates of these headers.
2024-06-19 08:09:10 +02:00
circl
c169e43e13 Userland: Remove some SerenityOS checks 2024-06-10 13:53:01 +02:00
Andreas Kling
260c5c50ad LibHTTP+RequestServer: Use HTTP::HeaderMap for request headers
No longer just for response headers! The same type is obviously useful
and ergonomic when making requests as well.
2024-06-09 15:34:02 +02:00
Andreas Kling
5ac0938859 LibHTTP+LibWeb: Stop bundling "Set-Cookie" headers as JSON
Before we had HTTP::HeaderMap (which preserves multiple headers with the
same name), we collected multiple "Set-Cookie" headers and bundled them
together as a JSON array.

This was a huge hack, and now we can stop doing that, since LibWeb gets
access to the full set of headers now.
2024-06-09 15:34:02 +02:00
Andreas Kling
e636851481 LibHTTP+RequestServer: Add HTTP::HeaderMap and use for response headers
Instead of using a HashMap<ByteString, ByteString, CaseInsensitive...>
everywhere, we now encapsulate this in a class.

Even better, the new class also allows keeping track of multiple headers
with the same name! This will make it possible for HTTP responses to
actually retain all their headers on the perilous journey from
RequestServer to LibWeb.
2024-06-09 15:34:02 +02:00
Andreas Kling
f2fd8fc928 Everywhere: Remove LibGemini
This hasn't been maintained (or worked at all) for a long time,
and it's not a widely supported protocol, so let's drop it.
2024-06-04 09:19:39 +02:00
Jamie Mansfield
295c4ef51a LibWeb/Fetch: Use MimeType in DataURL 2024-06-02 19:55:53 +02:00
Timothy Flynn
1e97ae66e5 LibWeb: Support unbuffered resource load requests
This adds an alternate API to ResourceLoader to load HTTP/HTTPS/Gemini
requests unbuffered. Most of the changes here are moving parts of the
existing ResourceLoader::load method to helper methods so they can be
re-used by the new ResourceLoader::load_unbuffered.
2024-05-26 18:29:24 +02:00
Timothy Flynn
168d28c15f LibProtocol+Userland: Support unbuffered protocol requests
LibWeb will need to use unbuffered requests to support server-sent
events. Connection for such events remain open and the remote end sends
data as HTTP bodies at its leisure. The browser needs to be able to
handle this data as it arrives, as the request essentially never
finishes.

To support this, this make Protocol::Request operate in one of two
modes: buffered or unbuffered. The existing mechanism for setting up a
buffered request was a bit awkward; you had to set specific callbacks,
but be sure not to set some others, and then set a flag. The new
mechanism is to set the mode and the callbacks that the mode needs in
one API.
2024-05-26 18:29:24 +02:00
Timothy Flynn
086ddd481d Ladybird+LibWeb: Move User-Agent definitions to their own file
This is to avoid including any LibProtocol header in Objective-C source
files, which will cause a conflict between the Protocol namespace and a
@Protocol interface.

See Ladybird/AppKit/Application/ApplicationBridge.cpp for why this
conflict unfortunately cannot be worked around.
2024-05-26 18:29:24 +02:00
Shannon Booth
d200704b6d LibWeb: Add Last-Modified header to file and resource requests 2024-04-02 07:51:02 +02:00
Andreas Kling
3d78f86a2f LibWeb: Don't ask RequestServer to prefetch DNS for file: and data: URLs 2024-03-26 11:39:45 +01:00
Timothy Flynn
24ecf31ff5 LibURL+LibWeb: Move data URL processing to LibWeb's fetch infrastructure
This is a fetching AO and is only used by LibWeb in the context of fetch
tasks. Move it to LibWeb with other fetch methods.

The main reason for this is that it requires the use of other LibWeb AOs
such as the forgiving Base64 decoder and MIME sniffing. These AOs aren't
available within LibURL.
2024-03-25 08:13:27 +01:00
Shannon Booth
e800605ad3 AK+LibURL: Move AK::URL into a new URL library
This URL library ends up being a relatively fundamental base library of
the system, as LibCore depends on LibURL.

This change has two main benefits:
 * Moving AK back more towards being an agnostic library that can
   be used between the kernel and userspace. URL has never really fit
   that description - and is not used in the kernel.
 * URL _should_ depend on LibUnicode, as it needs punnycode support.
   However, it's not really possible to do this inside of AK as it can't
   depend on any external library. This change brings us a little closer
   to being able to do that, but unfortunately we aren't there quite
   yet, as the code generators depend on LibCore.
2024-03-18 14:06:28 -04:00
Shannon Booth
9ce8189f21 Everywhere: Use unqualified AK::URL
Now possible in LibWeb now that there is no longer a Web::URL.
2024-02-25 08:54:31 +01:00
Andrew Kaster
e7daa02bf2 LibWeb: Unblock port 9000
This was blocked because it can be used for cross-protocol attacks on
some network printers. However, it's also used by the web platform
tests. One can argue that getting WPT working is more important than
theoretical attacks on poorly configured printers.
2024-02-12 11:43:22 -07:00
Bastiaan van der Plaat
cde14901bc Ladybird+LibWeb: Add initial about:version internal page 2024-01-13 13:41:09 -05:00
Bastiaan van der Plaat
05c0640474 Ladybird+LibWeb: Add about scheme support for internal pages 2024-01-13 13:41:09 -05:00
Andreas Kling
bf5ad56085 LibWeb: Ignore preconnect requests for file: and data: URLs
I noticed while debugging a fully downloaded page that it was trying
to preconnect to a file:// host. That doesn't make any sense, so let's
add a tiny bit of logic to ignore preconnect requests for file: and
data: URLs.
2023-12-30 13:49:50 +01:00
Bastiaan van der Plaat
f8feca5d21 LibWeb: Use directory page when viewing a resource schemed directory URL 2023-12-27 10:54:07 -05:00
Timothy Flynn
e511a264fe LibWeb: Implement ad-hoc steps to allow LibWeb to load resource:// URLs
The resource:// scheme is used for Core::Resource files. Currently, any
users of resource:// URLs in Ladybird must manually create the Resource
and extract its data. This will allow for passing the resource:// URL
along for LibWeb to handle.
2023-12-24 14:09:23 +01:00
Ali Mohammad Pur
5e1499d104 Everywhere: Rename {Deprecated => Byte}String
This commit un-deprecates DeprecatedString, and repurposes it as a byte
string.
As the null state has already been removed, there are no other
particularly hairy blockers in repurposing this type as a byte string
(what it _really_ is).

This commit is auto-generated:
  $ xs=$(ack -l \bDeprecatedString\b\|deprecated_string AK Userland \
    Meta Ports Ladybird Tests Kernel)
  $ perl -pie 's/\bDeprecatedString\b/ByteString/g;
    s/deprecated_string/byte_string/g' $xs
  $ clang-format --style=file -i \
    $(git diff --name-only | grep \.cpp\|\.h)
  $ gn format $(git ls-files '*.gn' '*.gni')
2023-12-17 18:25:10 +03:30
Aliaksandr Kalenik
6ac43274b2 LibWeb+LibJS: Use JS::GCPtr for pointers to GC-allocated objects
Fixes warnings found by LibJSGCVerifier
2023-12-11 16:55:25 +01:00
Andrew Kaster
04670c7a06 LibWeb: Hide load started/completed debug messages behind SPAM_DEBUG 2023-12-08 20:04:13 -05:00
Shannon Booth
6aff55d655 LibWeb: Port NavigatorID from DeprecatedString to String 2023-11-20 15:00:19 +01:00
Andrew Kaster
ea95256f83 LibWeb: Remove unused ResourceLoader::load(URL) overload
With the refactoring of Workers, nobody is calling this LoadRequest-less
overload. All new code should eventually be moved to fetch anyway.
2023-11-15 12:56:33 +01:00
Andrew Kaster
d7d84ee931 LibWeb: Ensure a Web::Page is associated with local Worker LoadRequests
This is a hack on top of a hack because Workers don't *really* need to
have a Web::Page at all, but the ResourceLoader infra that should be
going away soon ™️ is not quite ready to axe that requirement for
cookies.
2023-11-15 12:56:33 +01:00
Aliaksandr Kalenik
b9e0ad4358 LibWeb: Make ResourceLoader pass body and headers in error callback
Pass body and headers of a failed request to callback so caller can
process them.
2023-10-03 09:41:56 +02:00
Bastiaan van der Plaat
8f2319e966 Ladybird+LibWeb: Rename FileDirectoryLoader to GeneratedPagesLoader 2023-09-24 19:59:00 -06:00
Bastiaan van der Plaat
eafdb06d87 LibWeb: Add directory entries page when visiting a local directory 2023-08-15 10:41:54 +01:00
auipc
1653c5ea41 LibWeb: Use current platform for navigator.platform
Before, navigator.platform would always report the platform as "Serenity
OS", regardless of whether or not that was true. It also did not include
the architecture, which Firefox and Chrome both do. Now, it can report
either "Linux x86_64" or "SerenityOS AArch64".
2023-08-13 05:13:18 +02:00
Karol Kosek
eb41f0144b AK: Decode data URLs to separate class (and parse like every other URL)
Parsing 'data:' URLs took it's own route. It never set standard URL
fields like path, query or fragment (except for scheme) and instead
gave us separate methods called `data_payload()`, `data_mime_type()`,
and `data_payload_is_base64()`.

Because parsing 'data:' didn't use standard fields, running the
following JS code:

    new URL('#a', 'data:text/plain,hello').toString()

not only cleared the path as URLParser doesn't check for data from
data_payload() function (making the result be 'data:#a'), but it also
crashes the program because we forbid having an empty MIME type when we
serialize to string.

With this change, 'data:' URLs will be parsed like every other URLs.
To decode the 'data:' URL contents, one needs to call process_data_url()
on a URL, which will return a struct containing MIME type with already
decoded data! :^)
2023-08-01 14:19:05 +02:00
Karol Kosek
f27b9b9563 LibWeb: Set Content-Type for data: URLs instead of checking MIME on load
This makes the loader more agnostic.

Additionally, this allows us to load tab in Ladybird with a 'data:' URL
containing parameters, as a Resource will now call
`mime_type_from_content_type` to extract the content type from MIME. :^)
2023-08-01 14:19:05 +02:00
Daniel
d8c1150f6b LibWeb: Respect "no-store" directive in cache-control header 2023-06-22 21:24:23 +02:00
MacDue
35612c6a7f AK+Everywhere: Change URL::path() to serialize_path()
This now defaults to serializing the path with percent decoded segments
(which is what all callers expect), but has an option not to. This fixes
`file://` URLs with spaces in their paths.

The name has been changed to serialize_path() path to make it more clear
that this method will generate a new string each call (except for the
cannot_be_a_base_url() case). A few callers have then been updated to
avoid repeatedly calling this function.
2023-04-15 06:37:04 +02:00
Timothy Flynn
69e8216f2c LibWeb: Do not use OS error codes in the error callback for file:// URLs
The error code passed here is expected to be an HTTP error code. Passing
errno codes does not make sense in that context.
2023-04-04 22:41:20 +01:00
Andreas Kling
652676fdc1 LibWeb: Make ResourceLoader insert a Content-Type header for file://
We make a guess using the MIME type guessing API in LibCore. This frees
clients of this code from having to do the guessing.
2023-03-22 23:34:32 +00:00
Andreas Kling
20132da88d LibWeb: Don't treat erroring subresource loads as success
If a subresource fails to load, we don't care that we got some custom
404 page. The subresource should still be considered failed.

This is an ad-hoc solution that unbreaks Acid2. This code will
eventually be replaced by fetch mechanisms.
2023-03-15 23:29:00 +01:00
Andreas Kling
3435820e1f LibWeb: Render HTML content if present for HTTP error pages
If an HTTP response fails with an error code (e.g 403) but still has
body content, we now render the content.

We only fall back to our own built-in error page if there's no body.
2023-02-24 19:15:49 +01:00
Tim Schumacher
606a3982f3 LibCore: Move Stream-based file into the Core namespace 2023-02-13 00:50:07 +00:00
Tim Schumacher
d43a7eae54 LibCore: Rename File to DeprecatedFile
As usual, this removes many unused includes and moves used includes
further down the chain.
2023-02-13 00:50:07 +00:00
Timothy Flynn
4a916cd379 Everywhere: Remove needless copies of Error / ErrorOr instances
Either take the underlying objects with release_* methods or move() the
instances around.
2023-02-10 09:08:52 +00:00
Timothy Flynn
96f409ec1e LibWeb+WebContent: Do not reference-count file request objects
There is currently a memory leak with these file request objects due to
the callback on_file_request_finish referencing itself in its capture
list. This object does not need to be reference counted or allocated on
the heap. It is only ever stored in a HashMap until a response is
received from the browser, and it is not shared.
2023-02-01 14:04:44 +00:00
Luke Wilde
6d188d72c0 LibWeb: Store cookies for every HTTP response
As per Fetch, we are supposed to store cookies from Set-Cookie as soon
as we receive response headers for any HTTP response, even in error
cases.

Required by Twitter to login, as it sets cookies via XHR.
2022-12-30 21:56:54 -05:00
Tim Schumacher
ed4c2f2f8e LibCore: Rename Stream::read_all to read_until_eof
This generally seems like a better name, especially if we somehow also
need a better name for "read the entire buffer, but not the entire file"
somewhere down the line.
2022-12-12 14:16:42 +01:00
Linus Groh
57dc179b1f Everywhere: Rename to_{string => deprecated_string}() where applicable
This will make it easier to support both string types at the same time
while we convert code, and tracking down remaining uses.

One big exception is Value::to_string() in LibJS, where the name is
dictated by the ToString AO.
2022-12-06 08:54:33 +01:00