Start work on a speculative HTML Parser in Swift. This component will
walk ahead of the normal HTML parser looking for fetch() requests to
make while the normal parser is blocked. This work exposed many holes in
the Swift C++ interop component, which have been reported upstream.
This commit:
- Prevents path traversal via the about: scheme
- Prevents loading about:inspector
- Requires about: URIs to be opaque paths
- Prevents crashes with invalid percent encoded paths
We only need a Page for file:// urls. At some point we probably
needed it for other kinds of requests, but the current functionality
doesn't need to store the Page pointer on the ResourceLoader.
Resulting in a massive rename across almost everywhere! Alongside the
namespace change, we now have the following names:
* JS::NonnullGCPtr -> GC::Ref
* JS::GCPtr -> GC::Ptr
* JS::HeapFunction -> GC::Function
* JS::CellImpl -> GC::Cell
* JS::Handle -> GC::Root
These changes are arbitrarily divided into multiple commits to make it
easier to find potentially introduced bugs with git bisect.Everything:
The modifications in this commit were automatically made using the
following command:
find . -name '*.cpp' -exec sed -i -E 's/dbg\(\) << ("[^"{]*");/dbgln\(\1\);/' {} \;
This patch adds a global (per-process) filter list to LibWeb that is
used to filter all outgoing resource load requests.
Basically we check the URL against a list of filter patterns and if
it's a match for any one of them, we immediately fail the load.
The filter list is a simple text file:
~/.config/BrowserContentFilters.txt
It's one filter per line and they are simple glob filters for now,
with implicit asterisks (*) at the start and end of the line.
This patchset makes ProtocolServer stream the downloads to its client
(LibProtocol), and as such changes the download API; a possible
download lifecycle could be as such:
notation = client->server:'>', server->client:'<', pipe activity:'*'
```
> StartDownload(GET, url, headers, {})
< Response(0, fd 8)
* {data, 1024b}
< HeadersBecameAvailable(0, response_headers, 200)
< DownloadProgress(0, 4K, 1024)
* {data, 1024b}
* {data, 1024b}
< DownloadProgress(0, 4K, 2048)
* {data, 1024b}
< DownloadProgress(0, 4K, 1024)
< DownloadFinished(0, true, 4K)
```
Since managing the received file descriptor is a pain, LibProtocol
implements `Download::stream_into(OutputStream)`, which can be used to
stream the download into any given output stream (be it a file, or
memory, or writing stuff with a delay, etc.).
Also, as some of the users of this API require all the downloaded data
upfront, LibProtocol also implements `set_should_buffer_all_input()`,
which causes the download instance to buffer all the data until the
download is complete, and to call the `on_buffered_download_finish`
hook.
Now that we attach the document to the frame before parsing, we have
to make sure we set the encoding on the document before parsing, or
things may not turn out well.
FrameLoader now begins by constructing a DOM::Document, and then builds
a document tree inside it based on the MIME type. For text/html we pass
control to the HTMLDocumentParser as before.
This gives us access to things like window.alert() during parsing.
Fixes#3973.
This isn't entirely symmetrical with on_load_start as it will also fire
on reloads and back/forward navigations. However, it's good enough for
some basic use cases, and we can do more sophisticated notifications
later on when we need them.
This patch makes Page weakable and allows page-less frames to exist.
Page is single-owner, and Frame is multiple-owner, so it's not sound
for Frame to assume its containing Page will stick around for its own
entire lifetime.
Fixes#3976.
Ref-counted objects must not be stack allocated. Make DOM::Document's
constructor private to avoid this issue. (I wish we could mark classes
as heap-only..)
This reverts my previous commit in WebServer and fixes the whole issue
in a much better way. Instead of having the MIME type guesser take a
URL (which we don't actually have in the WebServer at that point),
just take a path as a StringView.
Also, make use of the case-insensitive StringView::ends_with() :^)