20 commits

Author SHA1 Message Date
Timothy Flynn
e74d2b1762 LibWeb+LibWebView: Set the default path for invalid cookie Path values
We were missing this spec step when parsing the Path attribute.
2024-09-19 00:01:56 +01:00
Timothy Flynn
fce003a8f5 LibWeb+LibWebView: Implement the latest cookie draft RFC
We currently implement the official cookie RFC, which was last updated
in 2011. Unfortunately, web reality conflicts with the RFC. For example,
all of the major browsers allow nameless cookies, which the RFC forbids.

There have since been draft versions of the RFC published to address such
issues. This patch implements the latest draft.

Major differences include:
* Allowing nameless or valueless (but not both) cookies
* Formal cookie length limits
* Formal same-site rules (not fully implemented here)
* More rules around cookie domains
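
As an illustration only (Python, not the C++ parser this commit touches), the
first rule in the list above can be sketched as follows. `parse_name_value` is
a hypothetical helper; it covers just the name-value pair and assumes that a
pair without "=" is treated as a nameless cookie. Attributes and length limits
are left out.

```python
from typing import Optional, Tuple


def parse_name_value(pair: str) -> Optional[Tuple[str, str]]:
    """Split a cookie's name-value pair; return None if the cookie is rejected."""
    if "=" in pair:
        name, value = pair.split("=", 1)
    else:
        # Assumption: with no "=" at all, the whole string is the value of a
        # nameless cookie.
        name, value = "", pair

    # Trim surrounding spaces and tabs from both parts.
    name, value = name.strip(" \t"), value.strip(" \t")

    # Nameless or valueless is allowed, but not both at once.
    if not name and not value:
        return None
    return name, value
```

With this sketch, "token" parses as a nameless cookie, "name=" as a valueless
one, and "=" is rejected.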
2024-09-17 00:04:33 +01:00
Timothy Flynn
693af180dd LibWebView: Ensure manually expired cookies are purged from the database
Cookies are typically deleted by setting their expiry time to an ancient
time stamp (e.g. this is how WebDriver is required to delete cookies).

Previously, we would update the cookie in the cookie jar, which would
mark the cookie as dirty. We would then purge expired cookies from the
jar's transient storage, which removed the cookie from the dirty list.
If the cookie was also in the persisted storage, it would never become
expired there as it was no longer in the dirty list when the timer for
synchronization fired.

Now, we don't remove any cookies from the transient dirty list when we
purge expired cookies. We hold onto the dirty cookie until sync time,
where we now update the cookie in the persisted storage *before* we
delete expired cookies.
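
The ordering described above can be sketched in Python, with plain dicts
standing in for the transient and persisted storage. The class and method
names here are hypothetical and are not taken from the CookieJar source.

```python
from dataclasses import dataclass


@dataclass
class Cookie:
    name: str
    value: str
    expiry_time: float


class CookieJar:
    def __init__(self, persisted: dict):
        self.persisted = persisted      # stand-in for the database
        self.cookies = dict(persisted)  # transient storage
        self.dirty = {}                 # cookies waiting to be synchronized

    def update(self, cookie: Cookie) -> None:
        self.cookies[cookie.name] = cookie
        self.dirty[cookie.name] = cookie

    def purge_expired_transient(self, now: float) -> None:
        # Only the transient storage is pruned; the dirty list is left alone.
        self.cookies = {
            name: c for name, c in self.cookies.items() if c.expiry_time > now
        }

    def sync(self, now: float) -> None:
        # First write dirty cookies, so the persisted copy learns its new expiry.
        self.persisted.update(self.dirty)
        self.dirty.clear()
        # Only then delete whatever is expired in the persisted storage.
        expired = [n for n, c in self.persisted.items() if c.expiry_time <= now]
        for name in expired:
            del self.persisted[name]
```

If `sync` deleted expired cookies before applying the dirty list, or if the
purge dropped the dirty entry, a manually expired cookie would keep its old
expiry time in the persisted storage and survive.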
2024-09-07 11:10:27 +02:00
Shannon Booth
cc55732332 LibURL+Everywhere: Only percent decode URL paths when actually needed
Web specs do not expose percent-decoded URL path components to
JavaScript, but we were doing this in a number of places due to the
default behaviour of URL::serialize_path.

Since a percent-encoded URL path may not decode to valid UTF-8, this was
resulting in crashes in these places.

For example, when retrieving the pathname on an HTMLAnchorElement for
the URL:

http://ladybird.org/foo%C2%91%91

To fix this, make the URL class only return the percent-encoded
serialized path, matching the URL spec. When the decoded path is
required instead, explicitly call URL::percent_decode.

This fixes a crash running WPT URL tests for the anchor element on:

https://wpt.live/url/a-element.html
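
To see why the example path is a problem, here is a small Python check
(illustrative only, using the standard library rather than LibURL).
`is_valid_utf8` is a hypothetical helper.

```python
from urllib.parse import unquote_to_bytes

encoded_path = "/foo%C2%91%91"

# Percent-decoding yields raw bytes, which are not guaranteed to be UTF-8.
decoded = unquote_to_bytes(encoded_path)


def is_valid_utf8(data: bytes) -> bool:
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The encoded form is plain ASCII, so it is always safe to hand out as text.
encoded_ok = is_valid_utf8(encoded_path.encode("ascii"))
# The decoded form ends in a lone 0x91 continuation byte, which is invalid.
decoded_ok = is_valid_utf8(decoded)
```

Here `%C2%91` decodes to a valid two-byte sequence, but the trailing `%91` is a
stray continuation byte, so the decoded path cannot be stored in a type that
requires valid UTF-8.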
2024-08-05 09:58:13 +02:00
Andrew Kaster
28093fecae LibWebView+WebContent: Prefix AK::Duration with AK Namespace
2024-07-18 09:43:38 +01:00
Timothy Flynn
30e745ffa7 LibWebView: Replace usage of LibSQL with sqlite3
This makes WebView::Database wrap around sqlite3 instead of LibSQL. The
effect on outside callers is pretty minimal. The main consequences are:

1. We must ensure the Cookie table exists before preparing any SQL
   statements involving that table.
2. We can use an INSERT OR REPLACE statement instead of separate INSERT
   and UPDATE statements.
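
Both consequences can be sketched with Python's built-in sqlite3 module. The
table name, columns, and `store_cookie` helper are hypothetical and much
simpler than the real schema.

```python
import sqlite3

connection = sqlite3.connect(":memory:")

# Consequence 1: the table must exist before any statement that refers to it.
connection.execute(
    "CREATE TABLE IF NOT EXISTS Cookies ("
    " name TEXT, domain TEXT, path TEXT, value TEXT,"
    " PRIMARY KEY (name, domain, path))"
)


def store_cookie(name: str, domain: str, path: str, value: str) -> None:
    # Consequence 2: one statement covers both the insert and the update case,
    # because a row with the same primary key is replaced.
    connection.execute(
        "INSERT OR REPLACE INTO Cookies (name, domain, path, value)"
        " VALUES (?, ?, ?, ?)",
        (name, domain, path, value),
    )


store_cookie("sid", "example.com", "/", "first")
store_cookie("sid", "example.com", "/", "second")
rows = connection.execute("SELECT value FROM Cookies").fetchall()
```

After the second call the table still holds a single row, now carrying the
newer value.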
2024-06-06 11:27:03 -04:00
Timothy Flynn
398ae75f9a Ladybird+LibWebView: Introduce a cache for cookies backed by SQL storage
Now that the chrome process is a singleton on all platforms, we can
safely add a cache to the CookieJar to greatly speed up access. The way
this works is we read all cookies upfront from the database. As cookies
are updated by the web, we store a list of "dirty" cookies that need to
be flushed to the database. We do that synchronization every 30 seconds
and at shutdown.

There's plenty of room for improvement here, some of which is marked
with FIXMEs in the CookieJar.

Before these changes, in a SQL database populated with 300 cookies,
browsing to https://twinings.co.uk/, WebContent spent:

    19,806ms waiting for a get-cookie response
    505ms waiting for a set-cookie response

With these changes, it spends:

    24ms waiting for a get-cookie response
    15ms waiting for a set-cookie response
2024-05-01 07:06:26 +02:00
Timothy Flynn
306041f4ac LibWebView: Do not update cookie access time when fetched with WebDriver
When WebDriver accesses cookies, its spec specifically says to run:

    the first step of the algorithm in RFC6265 to compute cookie-string

So we should skip subsequent steps. We already skip step 2, which sorts
the cookies, but neglected to skip step 3 to update their last access
time.
2024-04-21 14:46:54 -04:00
Shannon Booth
e800605ad3 AK+LibURL: Move AK::URL into a new URL library
This URL library ends up being a relatively fundamental base library of
the system, as LibCore depends on LibURL.

This change has two main benefits:
 * Moving AK back more towards being an agnostic library that can
   be used between the kernel and userspace. URL has never really fit
   that description - and is not used in the kernel.
 * URL _should_ depend on LibUnicode, as it needs punycode support.
   However, it's not really possible to do this inside of AK as it can't
   depend on any external library. This change brings us a little closer
   to being able to do that, but unfortunately we aren't there quite
   yet, as the code generators depend on LibCore.
2024-03-18 14:06:28 -04:00
Timothy Flynn
5a20353bc4 LibWebView: Ensure we resolve cookie promises upon early returns
Note there is no test here, because this early return involves HTTP-only
cookies, which we don't have the infrastructure to test (we would need to
support custom HTTP headers in tests).
2024-03-06 14:38:49 -05:00
Timothy Flynn
f1d6693990 LibWebView: Reduce overhead of updating a cookie's last access time
Getting a document's cookie value currently involves:

1. Doing a large SELECT statement and filtering the results to match
   the document and some query parameters based on the cookie RFC.
2. For every cookie selected this way, doing an UPDATE to set its last
   access time.
3. For every UPDATE, doing a DELETE to remove all expired cookies.

There's no need to perform cookie expiration for every UPDATE. Instead,
we can do the expiration once after all the UPDATEs are complete.

This reduces time spent waiting for cookies on https://twinings.co.uk
from ~1.9s to ~1.3s on my machine.
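
A minimal sketch of the reordered work, using Python's sqlite3 module purely
for illustration (the code at this commit used LibSQL, and the schema and
`touch_cookies` helper below are hypothetical):

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE Cookies ("
    " name TEXT PRIMARY KEY, last_access_time INTEGER, expiry_time INTEGER)"
)
connection.executemany(
    "INSERT INTO Cookies VALUES (?, ?, ?)",
    [("a", 0, 500), ("b", 0, 2000), ("c", 0, 3000)],
)


def touch_cookies(names, now):
    # One UPDATE per selected cookie, as before.
    for name in names:
        connection.execute(
            "UPDATE Cookies SET last_access_time = ? WHERE name = ?",
            (now, name),
        )
    # A single DELETE after the loop, instead of one DELETE per UPDATE.
    connection.execute("DELETE FROM Cookies WHERE expiry_time < ?", (now,))


touch_cookies(["b", "c"], now=1000)
remaining = connection.execute(
    "SELECT name, last_access_time FROM Cookies ORDER BY name"
).fetchall()
```

The end state is the same as expiring after every UPDATE; only the number of
DELETE statements changes, from one per cookie to one per request.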
2024-02-26 19:59:09 +01:00
Timothy Flynn
83d2f59e2a LibWebView: Port the CookieJar to String
2024-01-26 20:22:39 +01:00
Timothy Flynn
85b8971a80 Ladybird+LibWeb+WebContent: Port the did_request_cookie IPC to String
2024-01-26 20:22:39 +01:00
Timothy Flynn
8ea4e37c27 LibWebView: Convert trivial ByteString uses to StringView in CookieJar
2024-01-26 20:22:39 +01:00
Timothy Flynn
5c5cbeb491 LibSQL+LibWebView: Do not manually serialize time stamps in CookieJar
LibSQL supports serializing time stamps as of commit
effcd080ca.

However, that commit serializes the timestamps as milliseconds, whereas
the CookieJar was serializing them as seconds. In retrospect, these
should have been updated in unison, along with the SQL heap version (as
this is a serialization change that affects the file format). So this
patch also updates the version, as this is not a backwards compatible
change.
2024-01-10 23:26:40 +01:00
Ali Mohammad Pur
5e1499d104 Everywhere: Rename {Deprecated => Byte}String
This commit un-deprecates DeprecatedString, and repurposes it as a byte
string.
As the null state has already been removed, there are no other
particularly hairy blockers in repurposing this type as a byte string
(what it _really_ is).

This commit is auto-generated:
  $ xs=$(ack -l '\bDeprecatedString\b|deprecated_string' AK Userland \
    Meta Ports Ladybird Tests Kernel)
  $ perl -pi -e 's/\bDeprecatedString\b/ByteString/g;
    s/deprecated_string/byte_string/g' $xs
  $ clang-format --style=file -i \
    $(git diff --name-only | grep -E '\.(cpp|h)$')
  $ gn format $(git ls-files '*.gn' '*.gni')
2023-12-17 18:25:10 +03:30
Shannon Booth
1b05598cd3 LibWeb: Port ParsedCookie from DeprecatedString to String
2023-11-28 17:15:27 -05:00
Shannon Booth
e28fb5c64c LibWeb: Port Cookie from DeprecatedString to String
2023-11-20 15:00:19 +01:00
Timothy Flynn
a39eebeb74 LibWebView: Reject cookies whose domain is on the Public Suffix List
2023-10-26 11:06:49 +02:00
Timothy Flynn
5c5a00dd3a Ladybird+LibWebView: Move CookieJar, Database, and History to LibWebView
These classes are used as-is in all chromes. Move them to LibWebView so
that non-Serenity chromes don't have to awkwardly reach into its headers
and sources.
2023-08-31 19:19:45 +02:00
Renamed from Userland/Applications/Browser/CookieJar.cpp