AK+Everywhere: Replace custom number parsers with fast_float
Some checks failed
CI / macOS, arm64, Sanitizer_CI, Clang (push) Waiting to run
CI / Linux, x86_64, Fuzzers_CI, Clang (push) Waiting to run
CI / Linux, x86_64, Sanitizer_CI, GNU (push) Waiting to run
CI / Linux, x86_64, Sanitizer_CI, Clang (push) Waiting to run
Package the js repl as a binary artifact / Linux, arm64 (push) Waiting to run
Package the js repl as a binary artifact / macOS, arm64 (push) Waiting to run
Package the js repl as a binary artifact / Linux, x86_64 (push) Waiting to run
Run test262 and test-wasm / run_and_update_results (push) Waiting to run
Lint Code / lint (push) Waiting to run
Label PRs with merge conflicts / auto-labeler (push) Waiting to run
Push notes / build (push) Waiting to run
Build Dev Container Image / build (push) Has been cancelled

Our floating point number parser was based on the fast_float library:
https://github.com/fastfloat/fast_float

However, our implementation only supports 8-bit characters. To support
UTF-16, we will need to be able to convert char16_t-based strings to
numbers as well. This works out-of-the-box with fast_float.

We can also use fast_float for integer parsing.
This commit is contained in:
Timothy Flynn 2025-06-26 19:06:46 -04:00 committed by Tim Flynn
parent 9fc3e72db2
commit 62d9a84b8d
Notes: github-actions[bot] 2025-07-03 13:53:10 +00:00
30 changed files with 413 additions and 3034 deletions

View file

@ -6,8 +6,8 @@
*/
#include <AK/Debug.h>
#include <AK/FloatingPointStringConversions.h>
#include <AK/SourceLocation.h>
#include <AK/StringConversions.h>
#include <AK/Vector.h>
#include <LibTextCodec/Decoder.h>
#include <LibWeb/CSS/CharacterTypes.h>
@ -377,7 +377,7 @@ u32 Tokenizer::consume_escaped_code_point()
}
// Interpret the hex digits as a hexadecimal number.
auto unhexed = AK::StringUtils::convert_to_uint_from_hex<u32>(builder.string_view()).value_or(0);
auto unhexed = AK::parse_hexadecimal_number<u32>(builder.string_view()).value_or(0);
// If this number is zero, or is for a surrogate, or is greater than the maximum allowed
// code point, return U+FFFD REPLACEMENT CHARACTER (<28>).
if (unhexed == 0 || is_unicode_surrogate(unhexed) || is_greater_than_maximum_allowed_code_point(unhexed)) {