Commit graph

11 commits

Author SHA1 Message Date
Timothy Flynn
418409aa6f AK: Fix usage of constexpr within Utf16View and related utilities
* Error and ErrorOr are not themselves constexpr, so a function returning
  these types cannot be constexpr.

* The UDL was trying to call Utf16View::validate, which is not constexpr
  itself. The compiler will actually already raise an error if a UTF-16
  literal is invalid, so let's just avoid the call altogether.
2025-07-05 01:25:22 +12:00
Timothy Flynn
9fc3e72db2 AK+Everywhere: Allow lonely UTF-16 surrogates by default
By definition, the web allows lonely surrogates by default. Let's have
our string APIs reflect this, so we don't have to pass an allow option
all over the place.
2025-07-03 09:51:56 -04:00
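To illustrate what "allowing lonely surrogates" in 9fc3e72db2 means in practice, here is a minimal sketch of a lenient UTF-16 decoder. The helper names (`is_high_surrogate`, `is_low_surrogate`, `decode_utf16_lenient`) are hypothetical and use the standard library rather than AK types; this is not the code from the commit.

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical helpers for illustration; these are not AK's actual functions.
constexpr bool is_high_surrogate(char16_t unit) { return unit >= 0xD800 && unit <= 0xDBFF; }
constexpr bool is_low_surrogate(char16_t unit) { return unit >= 0xDC00 && unit <= 0xDFFF; }

// Decodes UTF-16 code units into code points. A surrogate without a matching
// partner is passed through unchanged instead of being reported as an error.
inline std::vector<char32_t> decode_utf16_lenient(std::u16string_view units)
{
    std::vector<char32_t> code_points;
    for (std::size_t i = 0; i < units.size(); ++i) {
        char16_t unit = units[i];
        if (is_high_surrogate(unit) && i + 1 < units.size() && is_low_surrogate(units[i + 1])) {
            // Well-formed pair: combine both halves into one supplementary code point.
            char32_t high = static_cast<char32_t>(unit) - 0xD800;
            char32_t low = static_cast<char32_t>(units[i + 1]) - 0xDC00;
            code_points.push_back(0x10000 + (high << 10) + low);
            ++i; // The low surrogate has been consumed as part of the pair.
        } else {
            // Ordinary BMP code unit or a lonely surrogate: keep the value as-is.
            code_points.push_back(unit);
        }
    }
    return code_points;
}
```

A strict decoder would reject the lonely-surrogate case; making the lenient behavior the default is what removes the need to pass an allow option at each call site.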
Timothy Flynn
86b1c78c1a AK+Everywhere: Prepare Utf16View for integration with a UTF-16 string
To prepare for an upcoming Utf16String, this migrates Utf16View to store
its data as a char16_t. Most function definitions are moved inline and
made constexpr.

This also adds a UDL to construct a Utf16View from a string literal:

    auto string = u"hello"sv;

This lets us remove the NTTP Utf16View constructor, as we have found
that such constructors bloat binary size quite a bit.
2025-07-03 09:51:56 -04:00
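The UDL mentioned in 86b1c78c1a can be approximated outside of AK as follows. This is only a sketch: it returns a `std::u16string_view` instead of a Utf16View, and it uses a hypothetical `_u16sv` suffix, because standard C++ reserves suffixes without a leading underscore for the implementation.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical stand-in for the literal operator described in the commit.
// The compiler supplies the pointer and length of the u"..." literal, so no
// template parameter (and no per-literal template instantiation) is needed.
constexpr std::u16string_view operator""_u16sv(char16_t const* data, std::size_t length)
{
    return std::u16string_view { data, length };
}

// Usable in constant expressions because the operator is constexpr.
static_assert(u"hello"_u16sv.size() == 5);
```

Compared with a constructor taking the literal as a non-type template parameter, a literal operator like this is one ordinary function shared by all literals, which is consistent with the binary-size concern raised in the commit message.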
Timothy Flynn
66006d3812 AK+LibJS: Extract some UTF-16 helpers for use in an outside class
An upcoming Utf16String will need access to these helpers. Let's make
them publicly available.
2025-07-03 09:51:56 -04:00
Jonne Ransijn
04920d06f0 AK: Use simdutf when appending UTF-16 to StringBuilder
Adds a fast path for valid UTF-16 using `simdutf`, and falls back to
the slow path for unmatched surrogates.
2024-10-30 10:28:24 +01:00
Dan Klishch
7506736869 AK: Stop using ShortString in String::from_code_point
Refactor it to use StringBase::replace_with_new_short_string instead.
2024-01-21 16:16:15 -07:00
Sam Atkins
067d0689c5 AK: Replace C-style casts
2023-03-09 21:43:54 +01:00
Timothy Flynn
39bda0073e AK: Make StringBuilder::try_append_code_point actually fallible
It currently uses the non-fallible `append` method to append each UTF-8
encoded byte of the code point.
2023-01-08 12:13:15 +01:00
Timothy Flynn
b87e517deb AK: Remove now-unused AK::UnicodeUtils methods
2022-01-18 15:13:25 +00:00
Daniel Bertalan
c8367df746 LibC: Implement wcrtomb
This function converts a single wide character into its multibyte
representation (UTF-8 in our case). It is called from libc++'s
`std::basic_ostream<wchar_t>::flush`, which gets called at program exit
from a global destructor in order to flush `std::wcout`.
2021-10-15 21:50:19 -07:00
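The conversion that c8367df746 describes, from one wide character to its UTF-8 multibyte form, can be sketched as below. `encode_utf8` is a hypothetical helper, not the LibC implementation: it leaves out the `mbstate_t` and `errno` handling of the real `wcrtomb` interface and does not reject surrogate values.

```cpp
#include <cstddef>

// Hypothetical helper: writes the UTF-8 encoding of `code_point` into `out`
// (which must have room for 4 bytes) and returns the number of bytes written,
// or 0 if the value lies above U+10FFFF.
inline std::size_t encode_utf8(char32_t code_point, unsigned char* out)
{
    if (code_point <= 0x7F) {
        // 1 byte: 0xxxxxxx
        out[0] = static_cast<unsigned char>(code_point);
        return 1;
    }
    if (code_point <= 0x7FF) {
        // 2 bytes: 110xxxxx 10xxxxxx
        out[0] = static_cast<unsigned char>(0xC0 | (code_point >> 6));
        out[1] = static_cast<unsigned char>(0x80 | (code_point & 0x3F));
        return 2;
    }
    if (code_point <= 0xFFFF) {
        // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out[0] = static_cast<unsigned char>(0xE0 | (code_point >> 12));
        out[1] = static_cast<unsigned char>(0x80 | ((code_point >> 6) & 0x3F));
        out[2] = static_cast<unsigned char>(0x80 | (code_point & 0x3F));
        return 3;
    }
    if (code_point <= 0x10FFFF) {
        // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out[0] = static_cast<unsigned char>(0xF0 | (code_point >> 18));
        out[1] = static_cast<unsigned char>(0x80 | ((code_point >> 12) & 0x3F));
        out[2] = static_cast<unsigned char>(0x80 | ((code_point >> 6) & 0x3F));
        out[3] = static_cast<unsigned char>(0x80 | (code_point & 0x3F));
        return 4;
    }
    return 0; // Not a valid Unicode code point.
}
```

For example, U+20AC (the euro sign) falls in the three-byte range and encodes to the bytes E2 82 AC.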
Max Wipfli
3c2565da94 AK: Add UnicodeUtils with Unicode-related helper functions
This introduces the UnicodeUtils file, which contains helper functions
related to Unicode. This is in contrast to StringUtils, whose functions
are not directly related to Unicode and are, in theory,
encoding-agnostic.
2021-05-20 22:10:45 +02:00