CLDR-18155 Order languageData's scripts by number of users #4237
Force-pushed from 757c108 to cbce758.
Notice from the Jira-GitHub PR Checker Bot: the branch changed across the force-push.
@@ -1346,7 +1346,7 @@ XXX Code for transations where no currency is involved
 <language type="awa" scripts="Deva"/>
 <language type="awa" territories="IN" alt="secondary"/>
 <language type="ay" scripts="Latn" territories="BO"/>
-<language type="az" scripts="Arab Cyrl Latn" territories="AZ"/>
+<language type="az" scripts="Arab Latn Cyrl" territories="AZ"/>
The likely subtag for az is az_Latn_AZ, but I will note that there are more Azerbaijani speakers in Iran (az_Arab_IR). It's just that they haven't moved onto the internet the way Azerbaijan has.
@@ -1918,7 +1918,7 @@ XXX Code for transations where no currency is involved
 <language type="mai" scripts="Tirh" territories="IN NP" alt="secondary"/>
 <language type="mak" scripts="Latn"/>
 <language type="mak" scripts="Bugi" territories="ID" alt="secondary"/>
-<language type="man" scripts="Latn Nkoo"/>
+<language type="man" scripts="Nkoo Latn"/>
This is driven by the Ghana language entry, which says there is a large Manding population using Nkoo in Ghana. Really, man is a macrolanguage code, so we shouldn't give much weight to this entry listing one script ahead of the other. https://en.wikipedia.org/wiki/Manding_languages
Can you check that again? Nkoo figures can be skewed by advocacy.
@@ -2435,7 +2435,7 @@ XXX Code for transations where no currency is involved
 <language type="yrk" scripts="Cyrl"/>
 <language type="yrl" scripts="Latn"/>
 <language type="yua" scripts="Latn"/>
-<language type="yue" scripts="Hans Hant" territories="MO"/>
+<language type="yue" scripts="Hant Hans" territories="MO"/>
yue's likely subtag is yue_Hant_HK
Added in-line comments for some languages with interesting changes. Otherwise they all match likely subtags and public records.
public BasicLanguageData setScriptsWithoutPopulation(String scriptTokens) {
    List<String> scripts = new ArrayList<>();
    if (scriptTokens != null) {
        scripts = Arrays.asList(WHITESPACE_PATTERN.split(scriptTokens));
Splitter has a split-to-list method, FYI.
    for (String script : scripts) {
        scriptsByPopulation.put(script, 0);
    }
    return setScripts(scriptsByPopulation);
Better to build once, make immutable.
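One way to read the "build once, make immutable" suggestion is sketched below, using only the JDK. This is not the PR's code: the class `ScriptsWithoutPopulationSketch` and the method `scriptsWithoutPopulation` are hypothetical names, and the `WHITESPACE_PATTERN` definition is an assumption, since the diff only shows the constant being used. The sketch fills a local map in one pass and hands out a read-only view.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of the CLDR code base. */
final class ScriptsWithoutPopulationSketch {
    // Assumed definition; the diff only shows this constant being used.
    private static final Pattern WHITESPACE_PATTERN = Pattern.compile("\\s+");

    /**
     * Maps each script token to a placeholder population of 0, keeping the
     * order in which the tokens were listed. The result cannot be modified.
     */
    static Map<String, Integer> scriptsWithoutPopulation(String scriptTokens) {
        if (scriptTokens == null || scriptTokens.trim().isEmpty()) {
            return Collections.emptyMap();
        }
        // Build the map once in a local variable...
        Map<String, Integer> byPopulation = new LinkedHashMap<>();
        for (String script : WHITESPACE_PATTERN.split(scriptTokens.trim())) {
            byPopulation.put(script, 0);
        }
        // ...then expose only a read-only view of it.
        return Collections.unmodifiableMap(byPopulation);
    }
}
```

A `LinkedHashMap` is used here so that the listed script order survives, which matters once the order itself carries meaning.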
CLDR-18155
This sorts the scripts in the SupplementalData tags so that the first script listed is the one with the highest population. This helps resolve some of the ambiguities in interpreting the data.
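As a rough illustration of the ordering described above, the sketch below turns a script-to-population map into a `scripts` attribute value with the most-used script first. The class `ScriptOrderSketch` and the method `orderByPopulation` are hypothetical names, the alphabetical tie-break is an assumption rather than something the PR states, and the numbers in `main` are placeholder weights, not real population figures.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration; not part of the CLDR code base. */
final class ScriptOrderSketch {
    /** Joins script codes with spaces, highest population first. */
    static String orderByPopulation(Map<String, Long> populationByScript) {
        List<Map.Entry<String, Long>> entries = new ArrayList<>(populationByScript.entrySet());
        entries.sort((a, b) -> {
            // Descending by population.
            int byPopulation = Long.compare(b.getValue(), a.getValue());
            // Assumed tie-break: alphabetical by script code, for stable output.
            return byPopulation != 0 ? byPopulation : a.getKey().compareTo(b.getKey());
        });
        List<String> scripts = new ArrayList<>();
        for (Map.Entry<String, Long> entry : entries) {
            scripts.add(entry.getKey());
        }
        return String.join(" ", scripts);
    }

    public static void main(String[] args) {
        // Placeholder weights only; these are not real population counts.
        Map<String, Long> placeholder = new LinkedHashMap<>();
        placeholder.put("Hans", 1L);
        placeholder.put("Hant", 2L);
        System.out.println(orderByPopulation(placeholder)); // prints "Hant Hans"
    }
}
```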
ALLOW_MANY_COMMITS=true