205 commits

Author SHA1 Message Date
Andreas Kling
022a49e9ac LibWeb: Only notify PageClient about top-level browsing context loads
We don't need to notify the web views that some deeply nested iframe
has started loading a new URL (and we don't want it showing up in the
browser location bar either!)
2022-09-25 12:28:40 +02:00
davidot
4912b22e3b LibWeb+WebContent: Setup the js console client earlier
This allows us to print messages in inline scripts. Also add an example
of this in the welcome page to test this.
2022-09-21 17:34:32 +01:00
Andreas Kling
b02402e116 LibWeb: Fix null dereference in ResourceClient::set_resource()
If resource_did_load() results in the ResourceClient being destroyed,
we can't dereference the weak ResourceClient right after.
2022-09-21 11:51:18 +02:00
Andreas Kling
92deba7197 LibWeb: Implement Document/BrowsingContext hookup according to spec
We now implement the browsing context's "set active document" algorithm
from the spec, as well as the "discard" algorithm for browsing contexts
and documents.
2022-09-20 23:44:59 +02:00
Andreas Kling
3df9861814 LibWeb: Capture self as a WeakPtr in ResourceClient::set_resource()
It's not safe to capture `this` as a raw pointer here, since nothing
is guaranteed to keep the ResourceClient alive (even if the Resource
stays alive.)
2022-09-18 02:15:01 +02:00
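The weak capture described in 3df9861814 can be sketched with standard C++ smart pointers. This is only an illustration under assumptions: `Resource`, `Client`, `watch()` and `finish_loading()` are hypothetical names, and `std::weak_ptr` stands in for the codebase's own WeakPtr type rather than reproducing it.

```cpp
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a resource that notifies its clients later.
struct Resource {
    std::vector<std::function<void()>> pending_callbacks;

    void finish_loading()
    {
        // Take the list first so callbacks may safely register new ones.
        auto callbacks = std::move(pending_callbacks);
        pending_callbacks.clear();
        for (auto& callback : callbacks)
            callback();
    }
};

// Hypothetical client; std::enable_shared_from_this provides the weak handle.
struct Client : std::enable_shared_from_this<Client> {
    int load_notifications = 0;

    void watch(Resource& resource)
    {
        // Capture a weak reference instead of `this`: nothing guarantees
        // that the client outlives the stored callback.
        std::weak_ptr<Client> weak_self = weak_from_this();
        resource.pending_callbacks.push_back([weak_self] {
            if (auto self = weak_self.lock())
                ++self->load_notifications;
            // If the client is already gone, the callback does nothing.
        });
    }
};
```

With a raw `this` capture, the second half of the scenario (client destroyed before the resource finishes) would dereference a dangling pointer; with the weak capture it becomes a no-op.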
Andreas Kling
cd7262ee56 LibWeb+LibWebView+WebContent: Add Web::Platform::ImageCodecPlugin
This replaces the previous Web::ImageDecoding::Decoder interface.
While we're doing this, also move the SerenityOS implementation of this
interface from LibWebView to WebContent. That means we no longer have to
link with LibImageDecoderClient in applications that use a web view.
2022-09-16 15:15:50 +02:00
Andreas Kling
9567e211e7 LibWeb+WebContent: Add abstraction layer for event loop and timers
Instead of using Core::EventLoop and Core::Timer directly, LibWeb now
goes through a Web::Platform abstraction layer instead.

This will allow us to plug in Qt's event loop (and QTimer) over in
Ladybird, to avoid having to deal with multiple event loops.
2022-09-07 20:30:31 +02:00
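The idea behind the abstraction layer in 9567e211e7 can be sketched as an abstract timer interface that engine code uses without knowing the concrete event loop. All names below (`Timer`, `ManualTimer`, `schedule_timeout()`) are hypothetical and chosen for illustration; they are not the actual Web::Platform API.

```cpp
#include <functional>
#include <utility>

// Hypothetical abstract timer: engine code only ever sees this interface.
class Timer {
public:
    virtual ~Timer() = default;
    virtual void start(int interval_ms) = 0;
    std::function<void()> on_timeout;
};

// One possible backend. A real one would wrap the host's timer facility;
// this one is fired by hand, which is convenient for tests.
class ManualTimer final : public Timer {
public:
    void start(int interval_ms) override
    {
        m_interval_ms = interval_ms;
        m_active = true;
    }

    // Pretend that the interval has elapsed.
    void fire()
    {
        if (m_active && on_timeout)
            on_timeout();
    }

    int interval_ms() const { return m_interval_ms; }

private:
    int m_interval_ms = 0;
    bool m_active = false;
};

// Engine-side code, written purely against the interface.
void schedule_timeout(Timer& timer, int interval_ms, std::function<void()> callback)
{
    timer.on_timeout = std::move(callback);
    timer.start(interval_ms);
}
```

Swapping the backend then means providing another `Timer` subclass, while `schedule_timeout()` and its callers stay unchanged.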
Andreas Kling
6f433c8656 LibWeb+LibJS: Make the EventTarget hierarchy (incl. DOM) GC-allocated
This is a monster patch that turns all EventTargets into GC-allocated
PlatformObjects. Their C++ wrapper classes are removed, and the LibJS
garbage collector is now responsible for their lifetimes.

There's a fair amount of hacks and band-aids in this patch, and we'll
have a lot of cleanup to do after this.
2022-09-06 00:27:09 +02:00
MacDue
8d2c2f7c52 LibWeb: Determine the origin when navigating across documents 2022-08-26 00:21:10 +02:00
Andreas Kling
602f927982 LibWeb: Start implementing "create and initialize a Document" from HTML
The way we've been creating DOM::Document has been pretty far from what
the spec tells us to do, and this is a first big step towards getting us
closer to spec.

The new Document::create_and_initialize() is called by FrameLoader after
loading a "text/html" resource.

We create the JS Realm and the Window object when creating the Document
(previously, we'd do it on first access to Document::interpreter().)

The realm execution context is owned by the Environment Settings Object.
2022-08-05 12:46:40 +02:00
Undefine
97cc33ca47 Everywhere: Make the codebase more architecture aware 2022-07-27 21:46:42 +00:00
Andreas Kling
c964a6b548 LibWeb: Paper over a VERIFY() crash in ResourceLoader for now 2022-07-17 14:11:36 +02:00
sin-ack
3f3f45580a Everywhere: Add sv suffix to strings relying on StringView(char const*)
Each of these strings would previously rely on StringView's char const*
constructor overload, which would call __builtin_strlen on the string.
Since we now have operator ""sv, we can replace these with much simpler
versions. This opens the door to being able to remove
StringView(char const*).

No functional changes.
2022-07-12 23:11:35 +02:00
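As a rough analogy for 3f3f45580a, the standard library offers the same kind of literal suffix for `std::string_view`. The sketch below uses `std::string_view`, not the codebase's own StringView, so it only illustrates the general idea: the suffix takes the length from the literal instead of scanning for a terminator.

```cpp
#include <cstddef>
#include <string_view>

using namespace std::literals::string_view_literals;

// With the suffix, the length is taken from the literal itself.
constexpr std::string_view content_type = "text/html"sv;
static_assert(content_type.size() == 9);

// The difference is visible with an embedded NUL: the suffixed literal keeps
// all five characters, while the char const* constructor stops at the NUL.
constexpr std::string_view with_suffix = "ab\0cd"sv;

inline std::size_t length_via_char_pointer()
{
    return std::string_view("ab\0cd").size();
}
```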
Kenneth Myhra
92a3803066 LibWeb: Add timeout_callback to ResourceLoader::load() 2022-07-03 13:26:32 +02:00
Kenneth Myhra
07b6c7114b LibWeb: Use a single shot timer instead of an ordinary repetitive timer 2022-07-03 13:26:32 +02:00
Lucas CHOLLET
662711fa26 Browser+LibWeb+WebContent: Allow Browser to load local files
To achieve this goal:
 - The Browser unveils "/tmp/portal/filesystemaccess"
 - Pass the page through LoadRequest => ResourceLoader
 - ResourceLoader requests a file from the FileSystemAccessServer via IPC
 - OutOfProcessWebView handles it and sends a file descriptor back to
 the Page.
2022-06-27 20:22:15 +01:00
Andreas Kling
c03a0e7260 LibWeb: Fix unsafe capture of ref-to-local when setting up load timeout
We were capturing a reference to a stack local and then persisting the
closure, causing it to dereference a long-gone object when invoked.
2022-06-23 20:37:29 +02:00
Kenneth Myhra
c805987329 LibWeb: Add timeout functionality to ResourceLoader
Add timeout functionality to ResourceLoader and use it from
XMLHttpRequest.
2022-06-21 10:29:14 +01:00
Luke Wilde
210c3795f9 LibWeb: Apply content filter to DNS prefetch and pre-connect
Performing DNS prefetch or pre-connect on filtered URLs is wasteful,
as we would block any actual use further down the line.

A bunch of websites perform DNS prefetch and/or pre-connect to trackers
as well, for example:
```
prefetch DNS for 'https://adserver-us.adtech.advertising.com/'
prefetch DNS for 'https://secure.adnxs.com/'
prefetch DNS for 'https://bidder.criteo.com/'
prefetch DNS for 'https://static.criteo.net/'
prefetch DNS for 'https://cdn.krxd.net/'
prefetch DNS for 'https://widgets.outbrain.com/'
prefetch DNS for 'https://images.outbrain.com/'
prefetch DNS for 'https://log.outbrain.com/'
prefetch DNS for 'https://amplifypixel.outbrain.com/'
prefetch DNS for 'https://odb.outbrain.com/'
prefetch DNS for 'https://js-sec.indexww.com/'
prefetch DNS for 'https://as-sec.casalemedia.com/'
prefetch DNS for 'https://as.casalemedia.com/'
prefetch DNS for 'https://sofia.trustx.org/'
prefetch DNS for 'https://c.amazon-adsystem.com/'
prefetch DNS for 'https://s.amazon-adsystem.com/'
prefetch DNS for 'https://aax.amazon-adsystem.com/'
prefetch DNS for 'https://t.teads.tv/'
prefetch DNS for 'https://beacon.krxd.net/'
pre-connect to 'https://www.google-analytics.com/'
pre-connect to 'https://www.googletagmanager.com/'
```
2022-06-10 12:15:37 +01:00
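The check described in 210c3795f9 amounts to consulting the content filter before acting on a prefetch or pre-connect hint. A minimal sketch, assuming a filter that blocks any URL containing one of its patterns; `ContentFilter`, `is_filtered()` and `allowed_hints()` are hypothetical names, and the real filter's matching rules may differ.

```cpp
#include <string>
#include <vector>

// Hypothetical filter: a URL is blocked if it contains any listed pattern.
struct ContentFilter {
    std::vector<std::string> patterns;

    bool is_filtered(std::string const& url) const
    {
        for (auto const& pattern : patterns) {
            if (url.find(pattern) != std::string::npos)
                return true;
        }
        return false;
    }
};

// Returns only the hint URLs worth acting on. Filtered URLs are dropped,
// since any later request to them would be blocked anyway.
std::vector<std::string> allowed_hints(ContentFilter const& filter, std::vector<std::string> const& hint_urls)
{
    std::vector<std::string> allowed;
    for (auto const& url : hint_urls) {
        if (!filter.is_filtered(url))
            allowed.push_back(url);
    }
    return allowed;
}
```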
DexesTTP
bf6c4835e6 LibWeb: Allow configuring the default error page path 2022-05-29 23:00:04 +01:00
DexesTTP
26bb95425d LibWeb: Allow configuring the default favicon path
This is useful when using LibWeb in environments that aren't Serenity
2022-05-29 23:00:04 +01:00
Michiel Visser
7278ad761e LibHTTP+LibWeb: Accept Brotli encoded responses 2022-05-21 22:41:40 +02:00
DexesTTP
c00ae53b66 LibWeb: Abstract the LibProtocol ResourceLoader connection
This is the final component that required LibProtocol as a dependency
of LibWeb. With this, we can now remove the dependency, and LibWeb no
longer requires IPC to work :^)
2022-05-15 12:17:36 +02:00
DexesTTP
2198091bbc LibWeb: Abstract the image decoding via Web::ImageDecoding::Decoder
After this change, LibWeb now expects Web::ImageDecoding::Decoder to be
pre-initialized with a concrete implementation before using the webpage.
The previous implementation, based on the ImageDecoder service, has been
provided directly through an adapter in LibWebClient, and is now used as
the default value by WebContent.
2022-05-15 12:17:36 +02:00
Sam Atkins
f8950ea846 LibWeb: Don't treat any empty resources as errors
HTML, CSS, JS and text files (among other things) can all legitimately
be empty. Other types may be invalid, but that will be caught when
trying to parse it as a document, so this check can safely be removed.
2022-05-13 17:12:39 +02:00
Sam Atkins
1f82beded3 LibWeb: Make about:blank load correctly
- Don't treat an empty `about:blank` resource as an error.
- Give `about:` urls a content-type so `FrameLoader::parse_document()`
  won't reject them.
2022-05-13 16:25:33 +02:00
DexesTTP
6027ab9e12 LibWeb: Only generate ResourceLoader signposts while on Serenity 2022-05-06 14:11:03 +02:00
Anthony Van de Gejuchte
69ca27d3d7 LibWeb: Show correct favicon when default favicon is loaded
Block the replacement of the favicon by the default favicon loader
when a favicon that is loaded through a link tag is already active.

This way, the favicon in the link tags will be prioritized over
the default favicons from `/favicon.ico` or the Serenity default icon.
2022-04-10 12:10:59 +02:00
Andreas Kling
9ed5a14af2 LibWeb: Remove unused ResourceLoader::load_sync()
There are no remaining users of this API; everyone has been migrated
to asynchronous resource loading.
2022-04-10 01:37:45 +02:00
Andreas Kling
f21eb90294 LibWeb: Remove debug spam about proxy configuration lookups 2022-04-09 14:50:05 +02:00
Ali Mohammad Pur
a42e03b01a Browser+LibWeb+WebContent: Implement per-URL-pattern proxies
...at least for SOCKS5.
2022-04-09 12:21:43 +02:00
Andreas Kling
0f6e1f7a32 LibWeb: Make BrowsingContext ask PageClient when it wants to be scrolled
BrowsingContext shouldn't be scrolling itself. Instead, it has to update
the layout (to ensure that we have current document metrics), and then
ask the PageClient nicely to scroll it.

This fixes an issue where BrowsingContext sometimes believed itself to
be scrolled, but OOPWV had a different idea.
2022-04-06 19:35:08 +02:00
Andreas Kling
0eae88f613 LibWeb: Use FrameLoader::load_html() when loading error pages
Use our existing helper function for parsing an HTML string and opening
it as the main content of the attached browsing context.
2022-04-06 19:35:07 +02:00
Idan Horowitz
086969277e Everywhere: Run clang-format 2022-04-01 21:24:45 +01:00
Ali Mohammad Pur
5a0123fd2f LibWeb: Load X(HT)ML documents and transform them into HTML DOM 2022-03-28 23:11:48 +02:00
Simon Wanner
3d80d38954 LibWeb: Emit signposts for resource loads 2022-03-24 14:35:47 +01:00
Timothy Flynn
20eb441cba LibWeb: Evict replaced Resource objects from cache
When a Resource is converted to an ImageResource, evict the original
resource from cache. The original resource's data has been moved, so on
a warm reload of a page, when that resource is loaded from cache, it
would not have any data to actually show.
2022-03-23 21:26:35 +01:00
Timothy Flynn
90829fe880 LibWeb: Allow HTMLObjectElement to convert a Resource to ImageResource
HTMLObjectElement, when implemented according to the spec, does not know
the resource type specified by the 'data' attribute until after it has
actually loaded (i.e. it may be an image, XML document, etc.). Currently
we always use ImageLoader within HTMLObjectElement to load the object,
but will need to use ResourceLoader instead to generically load data.

However, ImageLoader / ImageResource have image-specific functionality
that HTMLObjectElement still needs if the resource turns out to be an
image. This patch will allow (only) HTMLObjectElement to convert the
generic Resource to an ImageResource as needed.
2022-03-23 13:44:51 +01:00
Simon Wanner
1d55437a76 LibWeb: Ignore invalid encodings in Content-Type headers 2022-03-21 10:47:46 +01:00
Andreas Kling
8f1a3f7878 LibWeb: Tweak our User-Agent string
- Switch from "Mozilla/4.0" to "Mozilla/5.0" to match other browsers.
- Remove references to KHTML and Gecko.
- Identify ourselves as "LibWeb+LibJS/1.0 Browser/1.0"

New UA: "Mozilla/5.0 (SerenityOS; x86_64) LibWeb+LibJS/1.0 Browser/1.0"
2022-03-20 23:16:22 +01:00
Andreas Kling
d03680a9e7 LibWeb: Always defer callbacks in ResourceClient::set_resource()
Previously, we'd invoke the load/fail callbacks synchronously for
resources that were already loaded and cached.

This patch uses deferred_invoke() in the already-loaded case to ensure
that we always invoke these callbacks in a consistent manner.

This isn't to fix a specific issue, but rather because I kept seeing
these callbacks being invoked synchronously on top of an already-tall
call stack, and it was hard to reason about what was going on.
2022-03-20 19:03:43 +01:00
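The behavior described in d03680a9e7 can be illustrated with a tiny queue-based event loop. This is a sketch under assumptions: `EventLoop`, `Resource`, `did_load()` and this free-standing `set_resource()` are simplified, hypothetical stand-ins, and only the name `deferred_invoke()` is taken from the commit message.

```cpp
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical miniature event loop: queued callbacks run on the next pump().
struct EventLoop {
    std::queue<std::function<void()>> deferred;

    void deferred_invoke(std::function<void()> callback)
    {
        deferred.push(std::move(callback));
    }

    void pump()
    {
        while (!deferred.empty()) {
            auto callback = std::move(deferred.front());
            deferred.pop();
            callback();
        }
    }
};

// Hypothetical resource that remembers who is waiting for it.
struct Resource {
    bool loaded = false;
    std::vector<std::function<void()>> waiting;

    void did_load()
    {
        loaded = true;
        auto callbacks = std::move(waiting);
        waiting.clear();
        for (auto& callback : callbacks)
            callback();
    }
};

void set_resource(EventLoop& loop, Resource& resource, std::function<void()> on_load)
{
    if (resource.loaded) {
        // Already cached: defer instead of calling back synchronously,
        // so the caller's stack has unwound before the callback runs.
        loop.deferred_invoke(std::move(on_load));
    } else {
        resource.waiting.push_back(std::move(on_load));
    }
}
```

In both branches the callback never runs inside `set_resource()` itself, which is the consistency the commit is after.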
Lenny Maiorani
c37820b898 Libraries: Use default constructors/destructors in LibWeb
https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#cother-other-default-operation-rules

"The compiler is more likely to get the default semantics right and
you cannot implement these functions better than the compiler."
2022-03-17 17:23:49 +00:00
Andreas Kling
252ed8ad18 LibWeb: Fail resource loads on HTTP 4xx or 5xx error
This fixes an issue on ACID3 where failing image loads with body content
would still get displayed.
2022-03-09 16:43:00 +01:00
Andreas Kling
fe67fe3791 LibWeb: Check for valid names in Document.createElement() & friends
We now validate that the provided tag names are valid XML tag names,
and otherwise throw an "invalid character" DOM exception.

2% progression on ACID3. :^)
2022-02-26 10:03:07 +01:00
Andreas Kling
8b2499b112 LibWeb: Make document.write() work while document is parsing
This necessitated making HTMLParser ref-counted, and having it register
itself with Document when created. That makes it possible for scripts to
add new input at the current parser insertion point.

There is now a reference cycle between Document and HTMLParser. This
cycle is explicitly broken by calling Document::detach_parser() at the
end of HTMLParser::run().

This is a huge progression on ACID3, from 31% to 49%! :^)
2022-02-21 22:00:28 +01:00
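The cycle-breaking step in 8b2499b112 can be sketched with `std::shared_ptr`. The types below are hypothetical, heavily reduced stand-ins, and `attach_new_parser()` is an invented helper; only `detach_parser()` and `run()` echo names from the commit message.

```cpp
#include <memory>
#include <utility>

struct Parser;

struct Document {
    std::shared_ptr<Parser> parser; // Document -> Parser

    void detach_parser() { parser.reset(); }
};

struct Parser {
    std::shared_ptr<Document> document; // Parser -> Document (completes the cycle)

    explicit Parser(std::shared_ptr<Document> doc)
        : document(std::move(doc))
    {
    }

    void run()
    {
        // ... parsing would happen here; scripts could reach the parser
        // through the document and feed it more input ...

        // Explicitly break the cycle, otherwise neither object is ever freed.
        // The caller is expected to hold its own reference during run().
        document->detach_parser();
    }
};

// Hypothetical helper: create a parser and register it with its document.
std::shared_ptr<Parser> attach_new_parser(std::shared_ptr<Document> const& document)
{
    auto parser = std::make_shared<Parser>(document);
    document->parser = parser;
    return parser;
}
```

Without the `detach_parser()` call at the end of `run()`, dropping all outside references would leave the two objects keeping each other alive.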
Luke Wilde
b717f7065e LibWeb: Send appropriate Accept header for FrameLoader requests
According to Fetch, we must send an Accept header with the value
"text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
for document, iframe and frame requests.

https://fetch.spec.whatwg.org/#concept-fetch

Required by uber.com.
2022-02-18 01:46:45 +01:00
Andreas Kling
0e2cd5540a LibWeb: Follow HTTP 3xx redirections when loading images
This basically copies some logic from FrameLoader to ImageLoader.
Ideally we'd share this code, but for now let's just get redirected
images to show up. :^)
2022-02-16 22:21:45 +01:00
Idan Horowitz
497dd5b354 LibWeb: Set response header cookies on redirects
Since we were previously relying on Document::set_cookie in order to
set cookies received as a 'Set-Cookie' response header, we would ignore
any response header cookies in redirect (status code 3xx) responses.

While this behaviour is not strictly enforced in the specification,
most major browsers do set cookies in redirect responses, and some
sites (e.g. Cookie Clicker) rely on this behaviour.

Since cookies are stored per-site and not per-document, this behaviour
is achieved by simply decoupling the cookie set mechanism from it.
2022-02-12 16:15:56 +00:00
Idan Horowitz
721a4a0a67 LibWeb: Ignore Location headers unless the response status code is 3xx
As per RFC7231 the Location header field has different meanings for
different response status codes:
For 201 (Created) responses, the Location value refers to the primary
resource created by the request.
For 3xx (Redirection) responses, the Location value refers to the
preferred target resource for automatically redirecting the request.
2022-02-12 16:15:56 +00:00
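The rule from 721a4a0a67 can be sketched as a small helper that yields a redirect target only for 3xx responses. `Headers` and `redirect_target()` are hypothetical names for this illustration, and the exact-case header lookup is a simplification (HTTP header names are case-insensitive).

```cpp
#include <map>
#include <optional>
#include <string>

using Headers = std::map<std::string, std::string>;

// Returns the redirect target, or nothing if the response is not a redirect.
std::optional<std::string> redirect_target(int status_code, Headers const& headers)
{
    // Location only means "go here instead" for 3xx responses. On a
    // 201 (Created) it names the newly created resource and is not followed.
    bool is_redirect = status_code >= 300 && status_code <= 399;
    if (!is_redirect)
        return std::nullopt;

    auto it = headers.find("Location");
    if (it == headers.end())
        return std::nullopt;
    return it->second;
}
```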
Andreas Kling
378bca8b0c LibWeb: Make debug logging of resource load errors red instead of green
Red is a bit more suspicious than green, after all. :^)
2022-02-04 00:16:25 +01:00