15 commits

Author SHA1 Message Date
Andrew Kaster
45301e8169 Everywhere: Remove AK_DONT_REPLACE_STD macro
Let's just always include `<utility>`. Placing our own declarations of
these functions in AK, incompatible with the STL's, was always fishy to
begin with.
2024-07-30 18:38:02 -06:00
Timothy Flynn
ebdb92eef6 LibUnicode+Everywhere: Merge LibLocale back into LibUnicode
LibLocale was split off from LibUnicode a couple years ago to reduce the
number of applications on SerenityOS that depend on CLDR data. Now that
we use ICU, both LibUnicode and LibLocale are actually linking in this
data. And since vcpkg gives us static libraries, both libraries are over
30MB in size.

This patch reverts the separation and merges LibLocale into LibUnicode
again. We now have just one library that includes the ICU data.

Further, this will let LibUnicode share the locale cache that previously
would only exist in LibLocale.
2024-06-23 19:52:45 +02:00
Timothy Flynn
83475c5380 LibUnicode: Replace Unicode string normalization with ICU
In a benchmark, ICU's implementation was over 3x faster than ours.
2024-06-18 21:07:56 +02:00
Idan Horowitz
945c58c7c1 LibUnicode: Generate and use code point composition mappings
These allow us to binary search the code point compositions based on
the first code point being combined, which makes the search close to
O(log N) instead of O(N).
2024-04-06 14:21:04 -04:00
Idan Horowitz
e227bf0f71 LibUnicode: Optimize the canonical composition algorithm implementation
It now takes O(N) time instead of O(N^2) time. Additionally, some
always-false conditions are removed.
2024-04-06 14:21:04 -04:00
Timothy Flynn
02a8683266 LibUnicode+LibJS: Stop propagating small OOM errors from normalization
This API only performs small allocations, and is only used by LibJS.
2023-09-09 13:03:25 -04:00
Timothy Flynn
58bc831750 LibUnicode: Return a String from Unicode normalization
2023-01-15 01:00:20 +00:00
Timothy Flynn
3d22efccca LibUnicode+LibJS: Propagate OOM from Unicode normalization
2023-01-09 22:48:15 +00:00
Timothy Flynn
d382e77d38 LibUnicode: Fix compilation when the UCD download is disabled
2022-12-14 15:24:48 +00:00
Linus Groh
57dc179b1f Everywhere: Rename to_{string => deprecated_string}() where applicable
This will make it easier to support both string types at the same time
while we convert code, and to track down remaining uses.

One big exception is Value::to_string() in LibJS, where the name is
dictated by the ToString AO.
2022-12-06 08:54:33 +01:00
Linus Groh
6e19ab2bbc AK+Everywhere: Rename String to DeprecatedString
We have a new, improved string type coming up in AK (OOM aware, no null
state), and while it's going to use UTF-8, the name UTF8String is a
mouthful - so let's free up the String name by renaming the existing
class.
Making the old one have an annoying name will hopefully also help with
quick adoption :^)
2022-12-06 08:54:33 +01:00
Gunnar Beutner
2d3567ee92 Meta+LibUnicode: Avoid relocations for static unicode data
Previously the s_decomposition_mappings variable would refer to other
data in s_decomposition_mappings_data. This would cause thousands of
avoidable relocations at load time.

This saves about 128kB RAM for each process which uses LibUnicode.
2022-11-06 17:34:06 +01:00
matcool
104b51b912 LibUnicode: Fix Hangul syllable composition for specific cases
This fixes `combine_hangul_code_points` which would try to combine
an LVT syllable with a trailing consonant, resulting in a wrong
character.

Also added a test for this specific case.
2022-10-07 07:53:27 -04:00
Timothy Flynn
19b758ce8b LibUnicode: Add to-and-from string converters for NormalizationForm
2022-10-06 22:14:44 +01:00
matcool
70d0c1616f LibUnicode: Add decomposition mappings and Unicode normalization
The mappings are exposed via `Unicode::code_point_decomposition(u32)`
and `Unicode::code_point_decompositions()`, the latter being useful for
reverse searching a code point from its decomposition.

The normalization code does not make use of `Quick_Check` props (https://www.unicode.org/reports/tr44/#Decompositions_and_Normalization),
meaning no quick check optimizations.
2022-10-06 08:24:39 -04:00
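
As an illustration of the Hangul fix in 104b51b912, here is a minimal sketch of pairwise Hangul composition, following the arithmetic given in the Unicode Standard (section 3.12, Conjoining Jamo Behavior). The function name `compose_hangul` and the constant names are hypothetical and are not taken from LibUnicode. The real `combine_hangul_code_points` may be structured differently. The point shown is the guard on the second branch: a trailing consonant may only be attached to an LV syllable, never to one that is already LVT.

```cpp
#include <cstdint>
#include <optional>

// Constants from the Unicode Standard, section 3.12.
constexpr uint32_t S_BASE = 0xAC00; // first precomposed Hangul syllable
constexpr uint32_t L_BASE = 0x1100; // first leading consonant
constexpr uint32_t V_BASE = 0x1161; // first vowel
constexpr uint32_t T_BASE = 0x11A7; // one before the first trailing consonant
constexpr uint32_t L_COUNT = 19;
constexpr uint32_t V_COUNT = 21;
constexpr uint32_t T_COUNT = 28;
constexpr uint32_t N_COUNT = V_COUNT * T_COUNT; // 588
constexpr uint32_t S_COUNT = L_COUNT * N_COUNT; // 11172

// Hypothetical helper (not the LibUnicode signature): tries to compose two
// adjacent code points into one Hangul syllable, or returns nullopt.
constexpr std::optional<uint32_t> compose_hangul(uint32_t a, uint32_t b)
{
    // L + V -> LV syllable.
    if (a >= L_BASE && a < L_BASE + L_COUNT && b >= V_BASE && b < V_BASE + V_COUNT) {
        uint32_t l_index = a - L_BASE;
        uint32_t v_index = b - V_BASE;
        return S_BASE + (l_index * V_COUNT + v_index) * T_COUNT;
    }

    // LV + T -> LVT syllable. The modulo check rejects syllables that already
    // carry a trailing consonant (LVT); without it, LVT + T would produce a
    // wrong character.
    if (a >= S_BASE && a < S_BASE + S_COUNT && (a - S_BASE) % T_COUNT == 0
        && b > T_BASE && b < T_BASE + T_COUNT) {
        return a + (b - T_BASE);
    }

    return std::nullopt;
}
```

With this sketch, composing U+1112 and U+1161 gives U+D558, and composing that result with U+11AB gives U+D55C. Composing U+AC01 (already an LVT syllable) with U+11A8 yields no result, which is the case the commit describes.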