mirror of https://github.com/LadybirdBrowser/ladybird.git (synced 2025-07-29 04:09:13 +00:00)
AK: Add AllowSurrogates to UTF-8 validator
The [UTF-8](https://datatracker.ietf.org/doc/html/rfc3629#page-5) standard says to reject strings containing high or low surrogates. However, many standards, ECMAScript included, allow unpaired surrogates (and therefore UTF-8-encoded surrogates) in strings. So, this commit extends the UTF-8 validation API with `AllowSurrogates`, which controls whether high and low surrogate code points are rejected.
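To illustrate the distinction, here is a minimal standalone sketch of a validator with such a switch. It is not the AK implementation: the function name `is_valid_utf8` and its signature are hypothetical, and only the option name `AllowSurrogates` is taken from the commit message (the `Yes`/`No` enumerators are an assumption).

```cpp
#include <cstddef>
#include <cstdint>
#include <string_view>

// Assumed shape of the option; only its name appears in the commit message.
enum class AllowSurrogates { Yes, No };

// Hypothetical helper: returns true if `bytes` is well-formed UTF-8.
// With AllowSurrogates::Yes, three-byte encodings of U+D800..U+DFFF
// are accepted in addition to what RFC 3629 permits.
bool is_valid_utf8(std::string_view bytes, AllowSurrogates allow_surrogates)
{
    size_t i = 0;
    while (i < bytes.size()) {
        auto lead = static_cast<uint8_t>(bytes[i]);
        size_t length = 0;
        uint32_t code_point = 0;
        uint32_t minimum = 0; // smallest code point allowed for this length

        if (lead < 0x80) {
            length = 1; code_point = lead; minimum = 0;
        } else if ((lead & 0xE0) == 0xC0) {
            length = 2; code_point = lead & 0x1F; minimum = 0x80;
        } else if ((lead & 0xF0) == 0xE0) {
            length = 3; code_point = lead & 0x0F; minimum = 0x800;
        } else if ((lead & 0xF8) == 0xF0) {
            length = 4; code_point = lead & 0x07; minimum = 0x10000;
        } else {
            return false; // stray continuation byte or invalid lead byte
        }

        if (bytes.size() - i < length)
            return false; // truncated sequence

        for (size_t j = 1; j < length; ++j) {
            auto byte = static_cast<uint8_t>(bytes[i + j]);
            if ((byte & 0xC0) != 0x80)
                return false; // not a continuation byte
            code_point = (code_point << 6) | (byte & 0x3F);
        }

        if (code_point < minimum || code_point > 0x10FFFF)
            return false; // overlong encoding or out of range

        bool is_surrogate = code_point >= 0xD800 && code_point <= 0xDFFF;
        if (is_surrogate && allow_surrogates == AllowSurrogates::No)
            return false;

        i += length;
    }
    return true;
}
```

With this sketch, the byte sequence `ED A0 80` (the three-byte encoding of U+D800) passes under `AllowSurrogates::Yes` and fails under `AllowSurrogates::No`, while ordinary text is unaffected by the option.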
parent 5f66e31e56
commit 7560b640f3
Notes:
sideshowbarker
2024-07-17 03:35:24 +09:00
Author: https://github.com/dzfrias
Commit: 7560b640f3
Pull-request: https://github.com/LadybirdBrowser/ladybird/pull/96
Reviewed-by: https://github.com/alimpfard ✅
3 changed files with 21 additions and 8 deletions
@@ -105,11 +105,12 @@ ErrorOr<String> Utf16View::to_utf8(AllowInvalidCodeUnits allow_invalid_code_unit
            TRY(builder.try_append_code_point(static_cast<u32>(*ptr)));
        }
    } else {
        for (auto code_point : *this)
            TRY(builder.try_append_code_point(code_point));
        return builder.to_string_without_validation();
    }

    for (auto code_point : *this)
        TRY(builder.try_append_code_point(code_point));

    return builder.to_string();
}