Commit graph

26 commits

Author SHA1 Message Date
Timothy Flynn
32c07bc6c3 LibUnicode: Generate per-locale data for the "noon" fixed day period
Note that not all locales have this day period.
2022-07-21 20:36:03 +01:00
Timothy Flynn
12e7c0808a LibUnicode: Generate per-region week data
This includes:
* The minimum number of days in a week for that week to count as the
  first week of a new year.
* The day to be shown as the first day of the week in a calendar.
* The start/end days of the weekend.

Like the existing hour cycle data, week data is presented per-region in
the CLDR, rather than per-locale. The method to add likely subtags to a
locale to perform region lookups is the same.

The list of regions in the CLDR for hour cycle, minimum days, first day,
and weekend days are quite different. So rather than changing the
existing HourCycleRegion enum to a generic Region enum, we generate
separate enums for each of the week data fields. This allows each lookup
into these fields to remain simple array-based index access, without any
"jumps" for regions that don't have CLDR data for a field.
2022-07-06 16:56:42 +02:00
Timothy Flynn
70ede2825e LibUnicode: Use BCP 47 data to filter valid calendar names 2022-02-16 07:23:07 -05:00
Timothy Flynn
63c3437274 LibUnicode: Use BCP 47 data to generate available calendars and numbers
BCP 47 will be the single source of truth for known calendar and number
system keywords, and their aliases (e.g. "gregory" is an alias for
"gregorian"). Move the generation of available keywords to where we
parse the BCP 47 data, so that hard-coded aliases may be removed from
other generators.
2022-02-16 07:23:07 -05:00
Timothy Flynn
6efbafa6e0 Everywhere: Update copyrights with my new serenityos.org e-mail :^) 2022-01-31 18:23:22 +00:00
Timothy Flynn
ebd33e580b LibUnicode: Generate a list of available calendars 2022-01-31 00:32:41 +00:00
Timothy Flynn
4400150cd2 LibJS+LibUnicode: Return the appropriate time zone name depending on DST 2022-01-19 21:20:41 +00:00
Timothy Flynn
8d35563f28 LibUnicode: Implement TR-35's localized GMT offset formatting
This adds an API to use LibTimeZone to convert a time zone such as
"America/New_York" to a GMT offset string like "GMT-5" (short form) or
"GMT-05:00" (long form).
2022-01-11 23:56:35 +01:00
Timothy Flynn
15947aa1f0 LibUnicode: Add an hour-cycle field to DateTimeFormat's format pattern 2022-01-10 16:18:05 +01:00
Timothy Flynn
498b741434 LibUnicode: Use LibTimeZone's list of time zone names
LibUnicode no longer needs to generate a list of time zone names that it
parsed from metaZones.json. We can defer to the TZDB for a golden list
of time zones.
2022-01-08 12:45:34 +01:00
Timothy Flynn
ba4cdf34f8 LibUnicode: Convert UnicodeDateTimeFormat to link with weak symbols 2022-01-04 22:49:43 +00:00
Timothy Flynn
126a3fe180 LibUnicode: Add minimal support for generic & offset-based time zones
ECMA-402 now supports short-offset, long-offset, short-generic, and
long-generic time zone name formatting. For example, in the en-US locale
the America/Eastern time zone would be formatted as:

    short-offset: GMT-5
    long-offset: GMT-05:00
    short-generic: ET
    long-generic: Eastern Time

We currently only support the UTC time zone, however. Therefore, this
very minimal implementation does not consider GMT offset or generic
display names. Instead, the CLDR defines specific strings for UTC.
2022-01-03 15:11:59 +01:00
Timothy Flynn
62ff029890 LibUnicode: Generate CalendarSymbols in a predetermined order
Similar to commit 2a7f36b392, this change moves the generated
CalendarSymbol enumeration to the public LibUnicode/NumberFormat.h
header with a pre-defined set of symbols that we need. This is to
prepare for uniquely generating the CalendarSymbols structure.
2021-12-13 21:28:56 -08:00
Timothy Flynn
a417c23de0 LibUnicode: Parse and generate per-locale day period ranges 2021-12-10 21:27:24 +00:00
Timothy Flynn
fa8e881cfa LibUnicode: Parse and generate secondary day period symbols
Generate morning2, afternoon2, evening2, and night2 symbols.
2021-12-10 21:27:24 +00:00
Timothy Flynn
76aab821f4 LibJS+LibUnicode: Rename some Unicode::DayPeriod values
In the CLDR, there aren't "night" values, there are "night1" & "night2"
values. This is for locales which use a different name for nighttime
depending on the hour. For example, the ja locale uses "夜" between the
hours of 19:00 and 23:00, and "夜中" between the hours of 23:00 and
04:00. Our CLDR parser is currently ignoring "night2", so this rename
is to prepare for that.

We could probably come up with better names, but in the end, the API in
LibUnicode will be such that outside callers won't even see Night1, etc.
2021-12-10 21:27:24 +00:00
Timothy Flynn
2024d9e9ea LibUnicode: Add method to combine two format pattern skeletons
The fields of the generated elements must be in the same order as the
table here:
https://unicode.org/reports/tr35/tr35-dates.html#Date_Field_Symbol_Table

Further, only one field from each group of fields is allowed.
2021-12-09 23:43:04 +00:00
Timothy Flynn
9d4c4303fd LibUnicode: Parse and generate date time range format patterns 2021-12-09 23:43:04 +00:00
Timothy Flynn
fe84a365c2 LibUnicode: Parse and generate format pattern skeletons
Pattern skeletons are more or less the "key" of format patterns. Every
format pattern is assigned a skeleton. Interval patterns (which are not
yet parsed) are also assigned a skeleton - this is used to match them to
an "owning" format pattern. So we will use the skeleton generated here
to match format patterns at runtime with their available interval
patterns.

An alternative approach would be to append interval patterns directly to
their owning format pattern, but this has some draw backs:

    1. Skeletons aren't totally unique. A skeleton may appear in both
       the "dateFormats" and "availableFormats" objects, in which case
       the same interval formats would be generated more than once.

    2. Otherwise unique format patterns may only differ by the interval
       patterns assigned to them. This would cause the UniqueStorage for
       the format patterns to increase in size, impacting both compile
       times and libunicode.so size.
2021-12-09 23:43:04 +00:00
Timothy Flynn
b76e44f66f LibUnicode: Parse and generate time zone names in long and short form 2021-12-08 11:29:36 +00:00
Timothy Flynn
2bbf8aa24c LibUnicode: Generate era, month, weekday and day period calendar symbols
The parsing in parse_calendar_symbols() might be a bit more verbose than
it really needs to be, but it is to ensure the symbols are generated in
a known order that we can control with enumerations.
2021-12-08 11:29:36 +00:00
Timothy Flynn
6ace4000bf LibJS+LibUnicode: Supply field type in CalendarPattern's for-each method
Some callers will want different behavior depending on what field is
being provided to the callback.
2021-12-08 11:29:36 +00:00
Timothy Flynn
16151aa7d5 LibJS+LibUnicode: Implement the Intl.DateTimeFormat constructor 2021-11-29 22:48:46 +00:00
Timothy Flynn
48ce72e472 LibUnicode: Parse and generate regional hour cycles
Unlike most data in the CLDR, hour cycles are not stored on a per-locale
basis. Instead, they are keyed by a string that is usually a region, but
sometimes is a locale. Therefore, given a locale, to determine the hour
cycles for that locale, we:

    1. Check if the locale itself is assigned hour cycles.
    2. If the locale has a region, check if that region is assigned hour
       cycles.
    3. Otherwise, maximize that locale, and if the maximized locale has
       a region, check if that region is assigned hour cycles.
    4. If the above all fail, fallback to the "001" region.

Further, each locale's default hour cycle is the first assigned hour
cycle.
2021-11-29 22:48:46 +00:00
Timothy Flynn
7872934861 LibUnicode: Parse and generate available candidate format patterns
These formats are used by ECMA-402 when neither a date nor time style is
specified. In that case, these patterns are searched for a best match.
2021-11-29 22:48:46 +00:00
Timothy Flynn
f471ecdbe9 LibUnicode: Parse and generate date, time, and date-time format patterns 2021-11-29 22:48:46 +00:00