Commit graph

19 commits

Author SHA1 Message Date
Timothy Flynn
d265575269 AK: Add a Base64 decoder to decode into an existing buffer
Some callers (LibJS) will want to control the size of the output buffer,
to decode up to a maximum length. They will also want to receive partial
results in the case of an error. This patch adds a method to provide
those capabilities, and makes the existing implementation use it.
2024-09-03 17:43:03 +02:00
Timothy Flynn
41e14e3fc3 AK: Add an option to the base64 encoder to omit padding
Will be used by an upcoming JS prototype
2024-09-03 17:43:03 +02:00
Timothy Flynn
bfc9dc447f AK+LibWeb: Replace our home-grown base64 encoder/decoders with simdutf
We currently have 2 base64 coders: one in AK, another in LibWeb for a
"forgiving" implementation. ECMA-262 has an upcoming proposal which will
require a third implementation.

Instead, let's use the base64 implementation that is used by Node.js and
recommended by the upcoming proposal. It handles forgiving decoding as
well.

Our users of AK's implementation should be fine with the forgiving
implementation. The AK impl originally had naive forgiving behavior, but
that was removed solely for performance reasons.

Using http://mattmahoney.net/dc/enwik8.zip (100MB unzipped) as a test,
performance of our old home-grown implementations vs. the simdutf
implementation (on Linux x64):

                Encode    Decode
AK base64       0.226s    0.169s
LibWeb base64   N/A       1.244s
simdutf         0.161s    0.047s
2024-07-16 10:27:39 +02:00
Timothy Flynn
7e38653492 AK: Reject invalid Base64 encoded string lengths 2024-03-25 08:13:27 +01:00
Timothy Flynn
754ff41b9c AK: Remove whitespace skipping feature from AK's Base64 decoder
This was added in commit f2663f477f as a
partial implementation of what is now LibWeb's forgiving Base64 decoder.
All use cases within LibWeb that require whitespace skipping now use
that implementation instead.

Removing this feature from AK allows us to know the exact output size of
a decoded Base64 string. We can still trim whitespace at the start and
end of the input though; for example, this is useful when reading from a
file that may have a newline at the end of the file.
2024-03-25 08:13:27 +01:00
Andrew Kaster
e9b16970fe AK: Add base64url encoding and decoding methods
This encoding scheme comes from section 5 of RFC 4648, as an
alternative to the standard base64 encode/decode methods.

The only difference is that the last two characters are replaced
with '-' and '_', as '+' and '/' are not safe in URLs or filenames.
2024-03-20 12:18:57 -04:00
Ali Mohammad Pur
5e1499d104 Everywhere: Rename {Deprecated => Byte}String
This commit un-deprecates DeprecatedString, and repurposes it as a byte
string.
As the null state has already been removed, there are no other
particularly hairy blockers in repurposing this type as a byte string
(what it _really_ is).

This commit is auto-generated:
  $ xs=$(ack -l \bDeprecatedString\b\|deprecated_string AK Userland \
    Meta Ports Ladybird Tests Kernel)
  $ perl -pie 's/\bDeprecatedString\b/ByteString/g;
    s/deprecated_string/byte_string/g' $xs
  $ clang-format --style=file -i \
    $(git diff --name-only | grep \.cpp\|\.h)
  $ gn format $(git ls-files '*.gn' '*.gni')
2023-12-17 18:25:10 +03:30
Ben Wiederhake
f890b70eae Tests: Prefer TRY_OR_FAIL() and MUST() over EXPECT(!.is_error())
Note that in some cases (in particular SQL::Result and PDFErrorOr),
there is no Formatter defined for the error type, hence TRY_OR_FAIL
cannot work as-is. Furthermore, this commit leaves untouched the places
where MUST could be replaced by TRY_OR_FAIL.

Inspired by:
https://github.com/SerenityOS/serenity/pull/18710#discussion_r1186892445
2023-05-14 15:39:38 -06:00
Jelle Raaijmakers
25f2e4981c AK: Stop using DeprecatedString in Base64 encoding 2022-12-20 10:34:19 +01:00
Linus Groh
6e19ab2bbc AK+Everywhere: Rename String to DeprecatedString
We have a new, improved string type coming up in AK (OOM aware, no null
state), and while it's going to use UTF-8, the name UTF8String is a
mouthful - so let's free up the String name by renaming the existing
class.
Making the old one have an annoying name will hopefully also help with
quick adoption :^)
2022-12-06 08:54:33 +01:00
sin-ack
f6b1db37fc Tests: Convert TestBase64 decode test to use StringViews directly
Previously it would rely on the implicit StringView conversions. Now the
decode_equal function will directly use StringViews.
2022-07-12 23:11:35 +02:00
Idan Horowitz
086969277e Everywhere: Run clang-format 2022-04-01 21:24:45 +01:00
Andreas Kling
f2663f477f AK: Ignore whitespace while decoding base64
This matches how other implementations behave.

1% progression on ACID3. :^)
2022-02-25 19:54:13 +01:00
Sam Atkins
c388a879d7 AK+Userland: Make AK::decode_base64 return ErrorOr 2022-01-24 22:36:09 +01:00
Ben Wiederhake
768b70cc4d AK+Tests: Avoid implicitly copying ByteBuffer 2021-12-08 09:46:13 -08:00
Ben Wiederhake
cb868cfa41 AK+Everywhere: Make Base64 decoding fallible 2021-10-23 19:16:40 +01:00
Ben Wiederhake
3bf1f7ae87 AK: Don't crash on invalid Base64 input
In the long-term, we should probably have a way to signal decoding
failure. For now, it should suffice to at least not crash. This is
particularly relevant because apparently this can be triggered while
parsing a PEM certificate, which happens during every TLS connection.

Found by OSS Fuzz
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=38979
2021-10-23 19:16:40 +01:00
Brian Gianforcaro
d1644c26d6 Tests: Remove unused header includes 2021-08-01 08:10:16 +02:00
Brian Gianforcaro
67322b0702 Tests: Move AK tests to Tests/AK 2021-05-06 17:54:28 +02:00
Renamed from AK/Tests/TestBase64.cpp (Browse further)