Commit graph

3793 commits

Author SHA1 Message Date
Ali Mohammad Pur
76f5dce3db LibRegex: Flatten capture group list in MatchState
This makes copying the capture group COWVector significantly cheaper,
as we no longer have to run any constructors for it - just memcpy.
2025-04-18 17:09:27 +02:00
Andrew Kaster
ad00306daf AK: Disallow constness laundering in RefPtr and NonnullRefPtr
This is a re-application of 3c7a0ef1ac

Co-Authored-By: Andreas Kling <andreas@ladybird.org>
2025-04-16 10:41:44 -06:00
Andrew Kaster
703abac9c8 AK: Add const_cast escape hatch for converting const T& to RefPtr<T>
There are parts of the codebase where properly const-correctifying the
code would be a giant spaghetti mess, so add this loud workaround
to defer the refactoring for later.
2025-04-16 10:41:44 -06:00
Andreas Kling
0c93a07fb1 AK: Shrink Utf16View
Use a sentinel value instead of Optional for the cached length in code
points, shrinking Utf16View from 32 to 24 bytes.
2025-04-16 10:04:50 +02:00
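
The sentinel technique described in 0c93a07fb1 can be sketched in standard C++ roughly as follows. This is a minimal illustration under stated assumptions, not the actual AK::Utf16View: the struct names, the member names, and the choice of the maximum size_t value as the sentinel are all hypothetical, and std::optional stands in for AK's Optional. Exact sizes depend on the platform ABI.

```cpp
#include <cstddef>
#include <limits>
#include <optional>

// Hypothetical "before" layout: the optional carries an engaged flag plus padding.
struct ViewWithOptional {
    char16_t const* code_units { nullptr };
    std::size_t length_in_code_units { 0 };
    mutable std::optional<std::size_t> cached_length_in_code_points;
};

// Hypothetical "after" layout: one reserved value means "not computed yet".
struct ViewWithSentinel {
    static constexpr std::size_t not_cached = std::numeric_limits<std::size_t>::max();

    char16_t const* code_units { nullptr };
    std::size_t length_in_code_units { 0 };
    mutable std::size_t cached_length_in_code_points { not_cached };

    std::size_t length_in_code_points() const
    {
        if (cached_length_in_code_points != not_cached)
            return cached_length_in_code_points;

        std::size_t count = 0;
        for (std::size_t i = 0; i < length_in_code_units; ++i) {
            char16_t unit = code_units[i];
            bool is_high_surrogate = unit >= 0xD800 && unit <= 0xDBFF;
            bool has_next = i + 1 < length_in_code_units;
            // A well-formed surrogate pair counts as a single code point.
            if (is_high_surrogate && has_next && code_units[i + 1] >= 0xDC00 && code_units[i + 1] <= 0xDFFF)
                ++i;
            ++count;
        }
        cached_length_in_code_points = count;
        return count;
    }
};

static_assert(sizeof(ViewWithSentinel) <= sizeof(ViewWithOptional), "the sentinel layout should not be larger");
```

The trade-off is that the reserved value can no longer be a legitimate cached length, which is harmless here because no view can hold that many code points.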
Andreas Kling
7628ddfaf7 AK: Remove endianness override from Utf16View
Utf16View is now always in "host" endian mode. This makes it smaller
and less branchy for everyone!
2025-04-16 10:04:50 +02:00
Andreas Kling
0e9480b944 AK+LibTextCodec: Stop using Utf16View endianness override
This is preparation for removing the endianness override, since it was
only used by a single client: LibTextCodec.

While here, add helpers and make use of simdutf for fast conversion.
2025-04-16 10:04:50 +02:00
Aliaksandr Kalenik
247f7c5fcc AK: Add peek_some() to AllocatingMemoryStream
Same as read_some(), but doesn't move read position.
2025-04-15 18:48:53 +02:00
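
The difference between peeking and reading in 247f7c5fcc can be illustrated with a small sketch. The class below is a hypothetical, simplified in-memory stream, not the real AllocatingMemoryStream; its name and the raw pointer-plus-length signatures are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical, simplified in-memory stream; not the real AllocatingMemoryStream.
class SimpleMemoryStream {
public:
    void write(std::uint8_t const* bytes, std::size_t count)
    {
        m_buffer.insert(m_buffer.end(), bytes, bytes + count);
    }

    // Copies up to `capacity` bytes into `out` without consuming them.
    std::size_t peek_some(std::uint8_t* out, std::size_t capacity) const
    {
        std::size_t count = std::min(capacity, m_buffer.size() - m_read_offset);
        if (count > 0)
            std::memcpy(out, m_buffer.data() + m_read_offset, count);
        return count;
    }

    // Performs the same copy as peek_some(), then advances the read position.
    std::size_t read_some(std::uint8_t* out, std::size_t capacity)
    {
        std::size_t count = peek_some(out, capacity);
        m_read_offset += count;
        return count;
    }

private:
    std::vector<std::uint8_t> m_buffer;
    std::size_t m_read_offset { 0 };
};
```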
Andrew Kaster
5e7e6475c6 AK: Annotate [[no_unique_address]] members with NO_UNIQUE_ADDRESS macro
2025-04-15 02:19:06 -06:00
Andrew Kaster
864ddfb55d AK: Add macro to switch between no_unique_address attribute forms
Also add an escape hatch for completely disabling the annotation.
2025-04-15 02:19:06 -06:00
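
A macro of the kind described in 864ddfb55d might look like the sketch below. It assumes the two attribute forms are the standard [[no_unique_address]] and the MSVC-specific [[msvc::no_unique_address]]; the macro names and the opt-out define are hypothetical, and the real AK spelling may differ.

```cpp
// Hypothetical macro names; the real AK macro and its opt-out switch may be spelled differently.
#if defined(SKETCH_DISABLE_NO_UNIQUE_ADDRESS)
#    define SKETCH_NO_UNIQUE_ADDRESS
#elif defined(_MSC_VER)
#    define SKETCH_NO_UNIQUE_ADDRESS [[msvc::no_unique_address]]
#else
#    define SKETCH_NO_UNIQUE_ADDRESS [[no_unique_address]]
#endif

// An empty type that would otherwise occupy at least one byte as a member.
struct EmptyPolicy { };

struct Holder {
    SKETCH_NO_UNIQUE_ADDRESS EmptyPolicy policy;
    int value { 0 };
};
```

Whether the empty member actually shares storage is left to the compiler, so code should not depend on a particular sizeof(Holder).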
Andreas Kling
87ec5b32b0 LibRegex: Use ReadonlySpan to peek into OpCode_Compare LUTs
By the time we're executing bytecode, we know the bytecode will be
flattened. This means we can use ReadonlySpan to look into it instead of
DisjointChunks::spans(), which allocates.
2025-04-14 17:40:13 +02:00
stelar7
6ec914c7f7 LibWeb/IDB: Add some debug output
2025-04-09 11:48:49 -06:00
Andreas Kling
b2779ad9f7 AK: Shrink Utf16View from 40 bytes to 32 bytes
This ends up making RegexStringView smaller, which means less stuff to
copy when forking in the regex engine.

Thanks to Leon for suggesting the [[no_unique_address]] trick!
2025-04-09 07:22:01 +02:00
Andreas Kling
697e87b7bd AK: Make ""_string and ""_fly_string literals skip UTF-8 validation
We still validate in an ASSERT, but let's not bother with this in
release builds.
2025-04-09 07:22:01 +02:00
rmg-x
92f5183ced AK+Meta: Remove unused class RecursionDecision
2025-04-08 09:13:33 +02:00
rmg-x
fcf3abd19c AK: Remove unused class DOSPackedTime
2025-04-08 09:13:33 +02:00
Timothy Flynn
1d9e226206 AK: Remove unused UTF-8 / other factory methods from ByteString
2025-04-07 17:44:38 +02:00
Timothy Flynn
3f439efe21 AK: Rename StringImpl to ByteStringImpl
StringImpl is very specific to ByteString. Let's rename it to match, to
avoid confusion with the StringBase and StringData classes.
2025-04-07 17:44:38 +02:00
Timothy Flynn
0a256b0a9a AK+Everywhere: Change StringView case conversions to return String
There's a bit of a UTF-8 assumption with this change. But nearly every
caller of these methods was immediately creating a String from the
resulting ByteString anyway.
2025-04-07 17:44:38 +02:00
Timothy Flynn
05627b6f45 AK: Remove unused ByteString titlecase/invert case conversions
2025-04-07 17:44:38 +02:00
Timothy Flynn
c8bb3030fd AK: Return NonnullRefPtr from StringImpl::create methods
None of these return a nullptr.
2025-04-07 17:44:38 +02:00
Timothy Flynn
f029ba6a29 AK: Improve performance of ASCII case conversions
Don't use a Vector to form the transformed string. We can construct the
string immediately and store the result in its buffer, and thus avoid a
double allocation.

In a synthetic benchmark, lowercasing a 500 character ASCII string
1 million times reduced from 550ms to 65ms on my machine.
2025-04-07 17:44:38 +02:00
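
The single-allocation idea from f029ba6a29 can be sketched with std::string. This is an illustration only: the helper name is hypothetical, and AK's String has its own construction API that is not shown here.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper using std::string; AK's String is built differently.
std::string to_ascii_lowercase(std::string_view input)
{
    // Allocate the result once at its final size, then write into its buffer,
    // instead of filling a temporary vector and copying it into a string.
    std::string result(input.size(), '\0');
    for (std::size_t i = 0; i < input.size(); ++i) {
        char c = input[i];
        result[i] = (c >= 'A' && c <= 'Z') ? static_cast<char>(c - 'A' + 'a') : c;
    }
    return result;
}
```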
Timothy Flynn
13d7d3a60d AK: Simplify ASCII case conversion implementations a bit
* Use `any_of` instead of manual loops

* Don't check if a code point is upper/lowercase twice. The check we
  are using is already present inside the case converter.

* Move StringImpl's implementation into ByteString. ByteString is its
  only user, so let's avoid some jumping around. The other ASCII case
  methods on StringImpl will soon also be removed.
2025-04-07 17:44:38 +02:00
Timothy Flynn
ee6b2db009 AK+LibURL+LibWeb: Use simdutf to validate ASCII strings
simdutf provides a vectorized ASCII validator, so let's use that instead
of looping over strings manually.
2025-04-06 11:05:58 -04:00
Timothy Flynn
7f37a8f60f AK: Add an AK::find helper to return a reference to the found value
This is often more convenient than dealing with iterators.

This commit includes a couple conversions to find_value as examples.
2025-04-06 13:45:10 +02:00
rmg-x
37998895d8 AK+Meta+LibCore+Tests: Remove unused SipHash implementation
This is a homegrown implementation that wasn't actually used in
dependent classes. If this is needed in the future, using OpenSSL would
probably be a better option.
2025-04-06 01:47:50 +02:00
rmg-x
6480e1a3fe AK+Tests: Add support for URI syntax in IPv6Address::from_string
This supports IPv6 strings that start with `[` and end with `]` in
accordance with RFC3986 which states:

A host identified by an Internet Protocol literal address, version 6
[RFC3513] or later, is distinguished by enclosing the IP literal
within square brackets ("[" and "]").  This is the only place where
square bracket characters are allowed in the URI syntax.
2025-04-05 14:26:09 -04:00
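
The bracket handling described in 6480e1a3fe could be approximated by a pre-processing step like the one below. The helper name is hypothetical, and rejecting unbalanced brackets is an assumption; the commit message does not say how IPv6Address::from_string treats that case.

```cpp
#include <optional>
#include <string_view>

// Hypothetical helper: returns the text to hand to a plain IPv6 parser,
// or nothing when only one of the two brackets is present (an assumption).
std::optional<std::string_view> strip_ipv6_brackets(std::string_view input)
{
    bool starts_with_bracket = !input.empty() && input.front() == '[';
    bool ends_with_bracket = !input.empty() && input.back() == ']';
    if (starts_with_bracket != ends_with_bracket)
        return std::nullopt;
    if (starts_with_bracket) {
        input.remove_prefix(1);
        input.remove_suffix(1);
    }
    return input;
}
```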
Andreas Kling
3cf50539ec LibJS: Make Value() default-construct the undefined value
The special empty value (that we use for array holes, Optional<Value>
when empty and a few other placeholder/sentinel tasks) still
exists, but you now create one via JS::js_special_empty_value() and
check for it with Value::is_special_empty_value().

The main idea here is to make it very unlikely to accidentally create an
unexpected special empty value.
2025-04-05 11:20:26 +02:00
R-Goc
28d5d982ce Everywhere: Remove unused private fields
This commit removes the -Wno-unused-private-field flag, thus
re-enabling the warning. Unused fields were either removed or marked
[[maybe_unused]] when unsure.
2025-04-04 12:40:07 +02:00
Andrew Kaster
4ab89d8bbb AK: Add cxxCast template to allow Swift to perform simple checked casts
This is a workaround for the lack of support for imported class
hierarchies in Swift. Swift's ClangImporter doesn't tell the Swift
frontend about derived class relationships between imported types.
2025-04-03 16:47:48 -06:00
Andrew Kaster
08b27f7b6e AK: Mark Function as SWIFT_UNSAFE_REFERENCE
This allows us to import APIs that include a function without crashing
the frontend. Without this, it chokes on the move-only behavior of the
class.
2025-04-03 16:47:48 -06:00
Timothy Flynn
d403f02988 AK: Remove unused capacity field from StringData
This was added to be used with `kfree_sized` when we construct a String
from a StringBuilder. As of 53cac71cec, it
is unused, causing some compilers to raise a warning.

This reduces the size of StringData from 24 to 16 bytes.
2025-04-03 23:44:40 +02:00
R-Goc
5226a566e9 AK: Modify IntrusiveRedBlackTree for Windows
IntrusiveRedBlackTree relies on a member pointer for accessing the value
of a node. On Windows, member pointers can be of variable length,
depending on the inheritance structure of the class. This commit casts
the 4-byte member pointer (or rather, offset) to a full pointer type so
that the bit_cast to u8* works. Previously the source was smaller
than the destination, which fails inside __builtin_bit_cast().
2025-04-02 10:22:08 -06:00
Kenneth Myhra
82a2ae99c8 Everywhere: Remove DeprecatedFlyString + any remaining references to it
This reverts commit 7c32d1e8a5.
2025-04-02 11:43:13 +02:00
Andreas Kling
7c32d1e8a5 Revert "Everywhere: Remove DeprecatedFlyString + any remaining references to it"
This reverts commit 3131e6369f.

Greatly regressed JavaScript benchmark performance.
2025-04-01 15:40:27 +02:00
Kenneth Myhra
3131e6369f Everywhere: Remove DeprecatedFlyString + any remaining references to it
2025-04-01 12:50:00 +02:00
Ali Mohammad Pur
a83145c751 AK: Don't assert things about active union members in StringBase
This involves yeeting the 'invalid' union member as it was not really
checked against properly anyway; now the 'invalid' state is simply
StringData*{nullptr}, which was assumed to not exist previously.

Note that this is still accessing inactive union members, but is
promising to the compiler that they're fine where they are (the provided
debug macro AK_STRINGBASE_VERIFY_LAUNDER_DEBUG makes the
would-be-UB-if-not-for-launder ops verify that the operation is correct)

Should fix the GCC build.
2025-03-27 15:58:57 +00:00
Ali Mohammad Pur
c39eddaef8 Revert "AK: Don't try to free(UINTPTR_MAX) in StringData::delete()"
This reverts commit 693fe76d1c.
2025-03-27 15:58:57 +00:00
Andreas Kling
693fe76d1c AK: Don't try to free(UINTPTR_MAX) in StringData::operator delete()
...in constant-evaluated contexts. This was messing up GCC when building
the test262 runner. UINTPTR_MAX is the StringBase "invalid" tag.
2025-03-26 10:47:55 -04:00
Andreas Kling
53cac71cec AK: Inline most StringBase member functions
More work on recovering the performance regression from
DeprecatedFlyString removal.

Local measurements on my MBP:
- 2.5% speedup on Octane/zlib.js
- 2% speedup on Octane/typescript.js
2025-03-26 12:04:00 +00:00
Andreas Kling
7165d69868 AK: Inline more of the String and FlyString member functions
This is to help recover some of the performance regression from no
longer using DeprecatedFlyString (which was aggressively inlined.)
2025-03-26 02:20:11 +00:00
Andreas Kling
1662223e89 AK: Tweak ShortString bit layout slightly
Move the byte count one step to the left in order to make space
for the JS::StringOrSymbol flag.
2025-03-24 22:27:17 +00:00
Timothy Flynn
3961a4f16a AK: Fully qualify use of move in TemporaryChange
For some reason, after some seemingly unrelated upcoming changes, the
unqualified `move`s in this header result in an ADL failure:

AK/TemporaryChange.h:22:39: error: call to function 'move' that is
neither visible in the template definition nor found by argument-
dependent lookup
   22 |     ~TemporaryChange() { m_variable = move(m_old_value); }
      |                                       ^

Libraries/LibDNS/Resolver.h:491:29: note: in instantiation of member
function 'AK::TemporaryChange<bool>::~TemporaryChange' requested here
  491 |             TemporaryChange change(m_attempting_restart, true);
2025-03-22 17:27:45 +01:00
Andrew Kaster
f22f6e1f5b CMake: Remove unused CMake functions
2025-03-20 11:36:09 -06:00
Timothy Flynn
086a921213 AK: Disallow construction of JsonParser
JsonParser has a footgun where it does not retain ownership of the
string to be parsed. For example, the following results in UAF:

    JsonParser parser(something_returning_a_string());
    parser.parse();

Let's avoid this altogether by only allowing use of JsonParser with
a static, safe method.
2025-03-20 10:50:24 +01:00
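
The pattern adopted in 086a921213 (a private constructor plus a static entry point) can be sketched as follows. DigitParser is a hypothetical stand-in for JsonParser that only accepts a short run of decimal digits; std::string_view plays the role of the non-owning string the parser holds.

```cpp
#include <optional>
#include <string>
#include <string_view>

// Hypothetical parser that accepts only a run of up to nine decimal digits.
class DigitParser {
public:
    // The only entry point: the argument outlives the whole call, so the
    // non-owning view held by the parser cannot dangle.
    static std::optional<int> parse(std::string_view input)
    {
        DigitParser parser(input);
        return parser.run();
    }

private:
    // Private, so callers cannot bind the parser to a temporary string
    // and then call into it after the temporary has been destroyed.
    explicit DigitParser(std::string_view input)
        : m_input(input)
    {
    }

    std::optional<int> run() const
    {
        if (m_input.empty() || m_input.size() > 9)
            return std::nullopt;
        int value = 0;
        for (char c : m_input) {
            if (c < '0' || c > '9')
                return std::nullopt;
            value = value * 10 + (c - '0');
        }
        return value;
    }

    std::string_view m_input; // Non-owning view of the caller's string.
};
```

Passing a temporary is safe here, for example DigitParser::parse(std::string("123")), because the temporary lives until the end of the full expression that contains the call.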
Andrew Kaster
01ac48b36f AK: Support storing blocks in AK::Function
This has two slightly different implementations for ARC and non-ARC
compiler modes. The main idea is to store a block pointer as our
closure and use either ARC magic or BlockRuntime methods to manage
the memory for the block. Things are complicated by the fact that
we don't yet force-enable swift, so we can't count on the swift.org
llvm fork being our compiler toolchain. The patch adds some CMake
checks and ifdefs to still support environments without support
for blocks or ARC.
2025-03-18 17:15:08 -06:00
Andrew Kaster
be84ff4f2c AK: Add cast using objective-c __bridge qualifier
2025-03-18 17:15:08 -06:00
Andrew Kaster
0c2f434e69 AK: Add feature detection for -fblocks and -fobjc-arc
2025-03-18 17:15:08 -06:00
Tim Ledbetter
040dca0223 AK: Add first_is_equal_to_all_of()
This method returns true if all arguments are equal.
2025-03-18 21:55:06 +01:00
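
A helper with the behavior described in 040dca0223 could be written with a C++17 fold expression. The signature below is an assumption; the actual AK declaration may differ.

```cpp
// A sketch with an assumed signature; requires C++17 for the fold expression.
template<typename T, typename... Ts>
constexpr bool first_is_equal_to_all_of(T const& first, Ts const&... rest)
{
    // Expands to (first == r1) && (first == r2) && ...; true for an empty pack.
    return (... && (first == rest));
}

static_assert(first_is_equal_to_all_of(1, 1, 1));
static_assert(!first_is_equal_to_all_of(1, 2, 1));
```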
Shannon Booth
f05c0509c3 AK: Add ability to check Optional equality with an OptionalNone
While we don't want to be writing this type of code in 'normal'
code this is useful to do in tests as:

EXPECT_EQ(my_optional, OptionalNone {});

Has a much better error on assertion failure when compared with:

EXPECT(!my_optional.has_value());
2025-03-15 07:39:03 -04:00
Shannon Booth
f8f21319f9 LibURL/Pattern: Implement the URL Pattern Tokenizer
The tokenizer is used for both pattern string and constructor string
parsing of URL Patterns.
2025-03-15 07:39:03 -04:00