ladybird

mirror of https://github.com/LadybirdBrowser/ladybird.git synced 2025-10-23 00:19:18 +00:00

Author	SHA1	Message	Date
Glenn Skrzypczak	d25d62e74c	AK/Time+LibWeb/HTML: Fix ISO8601 week conversions This reimplements conversions between unix date times and ISO8601 weeks. The new algorithms also do not use loops, so they should be faster.	2025-08-14 11:05:28 -04:00
Timothy Flynn	8472e469f4	AK+LibJS+LibWeb: Recognize that our UTF-16 string is actually WTF-16 For the web, we allow a wobbly UTF-16 encoding (i.e. lonely surrogates are permitted). Only in a few exceptional cases do we strictly require valid UTF-16. As such, our `validate(AllowLonelySurrogates::Yes)` calls will always succeed. It's a wasted effort to ever make such a check. This patch eliminates such invocations. The validation methods will now only check for strict UTF-16, and are only invoked when needed.	2025-08-13 09:56:13 -04:00
Timothy Flynn	36c7302178	AK: Optimize the UTF-16 StringBuilder for ASCII storage When we build a UTF-16 string, we currently always switch to the UTF-16 storage mode inside StringBuilder. Then when it comes time to create the string, we switch the storage to ASCII if possible (by shifting the underlying bytes up). Instead, let's start out with ASCII storage and then switch to UTF-16 storage once we see a non-ASCII code point. For most strings, this will avoid allocating 2x the memory, and avoids many ASCII validation calls.	2025-08-13 09:56:13 -04:00
Timothy Flynn	99d7e08dff	AK: Templatize GenericLexer for UTF-16 strings We now define GenericLexer as a template to allow using it with UTF-16 strings. To keep existing users happy, the template is defined in the Detail namespace. Then AK::GenericLexer is an alias for a char-based view, and AK::Utf16GenericLexer is an alias for a char16-based view.	2025-08-13 09:56:13 -04:00
Timothy Flynn	28d9d3a2c7	AK+Libraries: Reduce API surface of GenericLexer a bit * Remove completely unused methods. * Deduplicate methods that were overloaded with both StringView and char const* parameters. A future commit will templatize GenericLexer by char type. This patch serves to make that a tiny bit easier.	2025-08-13 09:56:13 -04:00
Callum Law	861bcbd9ad	AK: Format floats with precision in scientific notation where applicable	2025-08-11 17:10:04 +01:00
Callum Law	1474da31c7	AK: Reduce duplication between put_f32_or_f64 and put_f64_with_precision We were handling the special cases of NaN and Infinity in basically the same way across both functions - we can reduce code duplication by moving this to before we branch. This is also required as we will be moving the logic to encode in scientific notation above the branch in a later commit and the `convert_floating_point_to_decimal_exponential_form` method doesn't work with non-finite values.	2025-08-11 17:10:04 +01:00
Timothy Flynn	f03c432b52	AK: Use simdutf for searching strings for a single code unit In the following synthetic benchmark, the simdutf version is 4x faster: BENCHMARK_CASE(find) { auto string = u"😀Foo😀Bar"sv; for (size_t i = 0; i < 100'000'000; ++i) (void)string.find_code_unit_offset('a'); }	2025-08-11 16:55:37 +02:00
Idan Horowitz	93692242b9	AK: Implement take_all_matching(predicate) API in HashMap	2025-08-08 13:09:58 -04:00
Idan Horowitz	5097e72174	AK: Implement take_all_matching(predicate) API in HashTable	2025-08-08 13:09:58 -04:00
Ali Mohammad Pur	e47fceed38	AK: Optionally keep track of the last slot in Vector last() and take_last() are extremely common ops when the vector is used like a stack.	2025-08-08 12:54:06 +02:00
Ali Mohammad Pur	2cd4b4e28d	AK: Skip vcalls to Stream::read_value and read_until_filled in LEB128 ...for the first byte. This function only really needs to read a single byte at that point, so read_until_filled() is useless and read_value<u8> is functionally equivalent to just a read. This showed up hot in a wasm parse benchmark.	2025-08-08 12:54:06 +02:00
Ali Mohammad Pur	bf4c436ef3	AK: Add some higher-level operations to DoublyLinkedList<T> This also adds a node cache as allocation/deallocation was showing up in my profiles; disabled by default to keep the old behaviour.	2025-08-08 12:54:06 +02:00
Ali Mohammad Pur	834fb0be36	AK: Make some Stream::read* functions available inline These are quite bottlenecky in wasm, the next commit will try to make use of this by calling them directly instead of doing a vcall, and having them inlineable helps the compiler a bit.	2025-08-08 12:54:06 +02:00
Ali Mohammad Pur	0f13952f30	AK: Simplify some stream reading logic These do the same thing in a less convoluted way. NFC.	2025-08-08 12:54:06 +02:00
Timothy Flynn	298ec6a12a	AK: Ensure StringBuilder encodes U+10000 as 2 UTF-16 code units	2025-08-07 02:05:50 +02:00
Timothy Flynn	1b611fba67	AK: Ensure Utf16FlyString is hash-compatible with Utf16View/Utf16String	2025-08-07 02:05:50 +02:00
Timothy Flynn	274f8ee462	AK: Make hashing of UTF-16 strings cheaper No need to iterate every byte of the string, we can iterate the code units instead. We must also actually record that we have cached the hash :^)	2025-08-07 02:05:50 +02:00
Timothy Flynn	73154defa8	AK: Allow implicitly constructing a Utf16View from a Utf16FlyString The same already works for String and FlyString into StringView, and for Utf16String into Utf16View.	2025-08-07 02:05:50 +02:00
Timothy Flynn	1bc80848fb	AK+LibWeb: Add a UTF-16 starts/ends with wrapper for a single code unit	2025-08-07 02:05:50 +02:00
Timothy Flynn	7082cafdbc	AK: Add a UTF-16 replacement wrapper to replace a single code unit Just for convenience for interop with existing code.	2025-08-07 02:05:50 +02:00
Timothy Flynn	9e0b1bdfca	AK: Add a parameter to to_number methods to change the parsed base This just forwards through to AK::parse_number.	2025-08-07 02:05:50 +02:00
Timothy Flynn	bbda6d13f7	AK: Add a Utf16View method to retrieve an iterator at a code unit offset	2025-08-07 02:05:50 +02:00
Timothy Flynn	6d1f90c739	AK: Remove now-unused UTF-16 length from UTF-8 string helper	2025-08-05 15:13:36 +02:00
Timothy Flynn	2dc0a3b3ce	AK: Add trim methods to Utf16String that skip allocation when not needed If the string does not begin with any of the provided code units, we do not need to create a new string.	2025-08-05 15:13:36 +02:00
Timothy Flynn	cd276235d7	AK: Add a couple of validation-skipping UTF-16 string factories String and FlyString are known to be valid UTF-8, so we can skip validation when constructing a UTF-16 string from them.	2025-08-05 07:07:15 -04:00
Timothy Flynn	782f8c381c	AK: Implement the spaceship operator for UTF-16 strings	2025-08-05 07:07:15 -04:00
Timothy Flynn	5af99f4dd0	AK: Allow Utf16StringBase to hold null data This is required by JS::PropertyKey. This will also be needed when we implement an Optional<Utf16String> specialization.	2025-08-05 07:07:15 -04:00
Timothy Flynn	0bf565b97f	AK: Allow comparing UTF-16 strings to UTF-8 strings Before now, you could compare a Utf16View to a StringView, but it would only be valid if the StringView were ASCII. When porting code to UTF-16, it will be handy to have a code point-aware implementation for non-ASCII StringViews.	2025-08-05 07:07:15 -04:00
Timothy Flynn	319e7aa03b	AK: Do not replace lonely surrogates with U+FFFD while iterating Utf8View doesn't do this either. The wobbly format is expected by JS.	2025-08-05 07:07:15 -04:00
Timothy Flynn	13ed6aba71	AK+LibIPC: Implement an encoder/decoder for UTF-16 strings	2025-08-02 10:10:14 -07:00
Aliaksandr Kalenik	d47a22150d	AK: Define `operator==` for HashMap	2025-07-30 11:06:05 +02:00
Grant Knowlton	9e1e4f3b15	AK: Validate compressed tags in IPv4-mapped IPv6 address This disallows parsing IPv4 mapped IPv6 address strings with multiple compression prefixes. Tests are provided for the updated functionality.	2025-07-30 00:53:10 +02:00
Timothy Flynn	d9502505c2	AK: Fix bounds assertions in Utf16View::iterator_offset	2025-07-28 18:30:50 +02:00
Timothy Flynn	67723ef83c	AK: Add a method to peek ahead of a UTF-16 iterator	2025-07-28 18:30:50 +02:00
Timothy Flynn	21d7d236e6	AK: Add a method to check if a UTF-16 string contains any code point	2025-07-28 18:30:50 +02:00
Timothy Flynn	96e75a023b	AK: Implement a UTF-16 UnixDateTime stringifier	2025-07-28 12:25:11 +02:00
Timothy Flynn	ed63a60247	AK: Return an empty optional when UTF-16 code unit lookup fails Accidentally returned the wrong type here.	2025-07-28 12:25:11 +02:00
Timothy Flynn	baddac5155	AK: Implement a method to split a UTF-16 string	2025-07-28 12:25:11 +02:00
Timothy Flynn	48a3b2c28e	AK: Implement a method to count instances of a needle in a UTF-16 string	2025-07-28 12:25:11 +02:00
Timothy Flynn	1375e6bf39	AK+LibJS+LibWeb: Use simdutf to create well-formed strings	2025-07-26 00:40:06 +02:00
Timothy Flynn	a740bfd8ff	AK+LibUnicode: Implement Unicode-aware UTF-16 case transformations	2025-07-25 18:16:22 +02:00
Timothy Flynn	df77ae1920	AK: Implement creating a UTF-16 string from a repeated code point	2025-07-25 18:16:22 +02:00
Timothy Flynn	a46e9b2adb	AK: Compute the correct capacity in StringBuilder::try_append_repeated This was mistakenly broken in `2803d66d87`.	2025-07-25 18:16:22 +02:00
Timothy Flynn	745f288796	AK: Implement a method to acquire a UTF-16 iterator's code unit offset This is the same as Utf8View::iterator_offset().	2025-07-25 18:16:22 +02:00
Jelle Raaijmakers	0b96690f0c	AK: Add HashMap::update() This updates a HashMap by copying another HashMap's keys and values.	2025-07-25 16:22:06 +02:00
Timothy Flynn	6c73dff120	AK: Implement a UTF-16 method to check if a string is ASCII whitespace	2025-07-24 19:00:20 +02:00
Timothy Flynn	f53389bab1	AK: Add a couple of Utf16String factories * Utf16String::from_utf8_with_replacement_character * Utf16String::from_code_point	2025-07-24 19:00:20 +02:00
Jelle Raaijmakers	b1c3ce807b	AK: Rename Utf16View::trim_whitespace() to ::trim_ascii_whitespace() This reflects the naming of String::trim_ascii_whitespace() and better indicates what exactly we're trimming.	2025-07-24 07:18:25 -04:00
Jelle Raaijmakers	9a03ee1c24	AK: Fix mention of renamed member in Utf16View	2025-07-24 07:18:25 -04:00
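
Several of the messages above (8472e469f4, 298ec6a12a, 319e7aa03b) turn on the difference between strict UTF-16 and the "wobbly" WTF-16 form that permits lonely surrogates. The following is a minimal standalone sketch of that distinction using only the C++ standard library. The helper names (`is_high_surrogate`, `append_code_point`, `is_strict_utf16`) are hypothetical and chosen for illustration; they are not AK's API, and the actual commits may be implemented quite differently.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helpers for illustration only; these are not AK functions.
constexpr bool is_high_surrogate(char16_t unit) { return unit >= 0xD800 && unit <= 0xDBFF; }
constexpr bool is_low_surrogate(char16_t unit) { return unit >= 0xDC00 && unit <= 0xDFFF; }

// Appends one code point as UTF-16 code units. Code points from U+10000
// upward are stored as a surrogate pair (two code units). A lone surrogate
// code point is stored as-is, which is what makes the result WTF-16.
inline void append_code_point(std::u16string& out, char32_t code_point)
{
    assert(code_point <= 0x10FFFF);
    if (code_point < 0x10000) {
        out.push_back(static_cast<char16_t>(code_point));
        return;
    }
    char32_t offset = code_point - 0x10000;
    out.push_back(static_cast<char16_t>(0xD800 + (offset >> 10)));
    out.push_back(static_cast<char16_t>(0xDC00 + (offset & 0x3FF)));
}

// Strict UTF-16 check: every surrogate must belong to a high/low pair.
// A WTF-16 consumer would skip this check entirely.
inline bool is_strict_utf16(std::u16string const& string)
{
    for (std::size_t i = 0; i < string.size(); ++i) {
        if (is_high_surrogate(string[i])) {
            if (i + 1 >= string.size() || !is_low_surrogate(string[i + 1]))
                return false;
            ++i; // Skip the low half of the pair.
        } else if (is_low_surrogate(string[i])) {
            return false;
        }
    }
    return true;
}
```

In this sketch, appending U+10000 yields the two code units 0xD800 0xDC00, and appending a lone 0xD83D afterwards produces a string that a WTF-16 consumer would accept but that fails the strict check.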

3950 commits
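
Commit d25d62e74c in the listing describes loop-free conversions between Unix dates and ISO 8601 weeks. As a rough illustration of how such a conversion can avoid loops, here is a standalone sketch based on the rule that a week belongs to the ISO year containing its Thursday, combined with the widely used closed-form civil-date arithmetic. All names (`days_from_civil`, `year_from_days`, `iso_week_from_days`, `IsoWeek`) are assumptions made for this example; this is not the code from the commit and may differ from it substantially.

```cpp
#include <cstdint>

// Hypothetical result type for this sketch.
struct IsoWeek {
    std::int64_t year;
    int week; // 1..53
};

// Days since 1970-01-01 for a proleptic Gregorian date (closed form, no loops).
constexpr std::int64_t days_from_civil(std::int64_t y, unsigned m, unsigned d)
{
    y -= m <= 2; // Treat January and February as part of the previous year.
    std::int64_t era = (y >= 0 ? y : y - 399) / 400;
    unsigned yoe = static_cast<unsigned>(y - era * 400);            // [0, 399]
    unsigned doy = (153 * (m > 2 ? m - 3 : m + 9) + 2) / 5 + d - 1; // [0, 365]
    unsigned doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;           // [0, 146096]
    return era * 146097 + static_cast<std::int64_t>(doe) - 719468;
}

// Calendar year containing the given day count since 1970-01-01.
constexpr std::int64_t year_from_days(std::int64_t z)
{
    z += 719468;
    std::int64_t era = (z >= 0 ? z : z - 146096) / 146097;
    unsigned doe = static_cast<unsigned>(z - era * 146097);                     // [0, 146096]
    unsigned yoe = (doe - doe / 1460 + doe / 36524 - doe / 146096) / 365;       // [0, 399]
    unsigned doy = doe - (365 * yoe + yoe / 4 - yoe / 100);                     // [0, 365]
    unsigned mp = (5 * doy + 2) / 153;                                          // [0, 11], March = 0
    return static_cast<std::int64_t>(yoe) + era * 400 + (mp >= 10 ? 1 : 0);
}

// ISO 8601 week-year and week number for a day count since 1970-01-01.
constexpr IsoWeek iso_week_from_days(std::int64_t days)
{
    // 1970-01-01 was a Thursday; map weekdays to Monday = 0 .. Sunday = 6.
    std::int64_t weekday = ((days + 3) % 7 + 7) % 7;
    // The Thursday of the same ISO week decides the ISO year.
    std::int64_t thursday = days - weekday + 3;
    std::int64_t year = year_from_days(thursday);
    std::int64_t day_of_year = thursday - days_from_civil(year, 1, 1); // 0-based
    return { year, static_cast<int>(day_of_year / 7) + 1 };
}
```

Under these assumptions, `iso_week_from_days(days_from_civil(2021, 1, 3))` gives week 53 of ISO year 2020, since that Sunday belongs to the week whose Thursday is 2020-12-31.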