Ucdxml and TR42 #859
Comment on June 6 is no longer valid - we're now ready for review.
@macchiati @eggrobin @markusicu - Please can you review?
@@ -310,6 +313,15 @@ Unihan_Variants ; kSpoofingVariant
Unihan_Variants ; kTraditionalVariant
Unihan_Variants ; kZVariant
Should there be a line for kZhuang here? (In other words, are you getting any data for kZhuang?)
The current version of UCDXML does not support kZhuang, just kZhuangNumeric.
Similar to Unikemet, we should add support either for the revised 16.0 UCDXML files, or for 17.
cjkRSTUnicode ; kRSTUnicode
cjkReading ; kReading
cjkSrc_NushuDuben ; kSrc_NushuDuben
cjkTGT_MergedSrc ; kTGT_MergedSrc
Please revert this change: Tangut and Nüshu are not CJK, so they should not have a cjk alias.
The name kReading is unfortunate (since this is really Nüshu-specific), but it is what it is.
I guess you should add the comment that I should have added, saying that these are the fields from the Tangut and Nüshu source files.
Fixed.
unicodetools/src/main/resources/org/unicode/props/ExtraPropertyAliases.txt
default:
throw new RuntimeException("Missing Catalog case");
}
case Enumerated:
This (and the associated pile of UnicodeMaps) seems like it is going to be a bit annoying to maintain as we add properties.
Is there a reason why you are not doing something like
final UnicodeProperty property = indexUnicodeProperties.getProperty(prop);
final List<String> valueAliases = property.getValueAliases(property.getValue(codepoint));
return valueAliases.size() == 1 ? valueAliases.get(0) : valueAliases.get(1);
for most of them (special-casing Decomposition_Type etc. as needed)?
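To make the suggestion concrete, here is a minimal, self-contained sketch of the generic lookup. It does not use the real unicodetools classes: `PropertyLookup` is a hypothetical stand-in that exposes only the two methods called in the snippet above, and the toy values and alias lists are invented for illustration and say nothing about the ordering of real alias lists.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the two UnicodeProperty methods used in the suggestion. */
interface PropertyLookup {
    String getValue(int codepoint);

    List<String> getValueAliases(String value);
}

final class AliasSketch {
    private AliasSketch() {}

    /**
     * Generic resolution, as proposed in the review comment: take the only alias
     * if there is just one, otherwise take the second entry of the alias list.
     */
    static String resolve(PropertyLookup property, int codepoint) {
        final List<String> valueAliases =
                property.getValueAliases(property.getValue(codepoint));
        return valueAliases.size() == 1 ? valueAliases.get(0) : valueAliases.get(1);
    }

    /** Toy data, invented for illustration only (not real UCD values). */
    static PropertyLookup toyProperty() {
        final Map<Integer, String> values = Map.of(0x41, "X", 0x42, "Y");
        final Map<String, List<String>> aliases =
                Map.of(
                        "X", List.of("X_first", "X_second"),
                        "Y", List.of("Y_only"));
        return new PropertyLookup() {
            @Override
            public String getValue(int codepoint) {
                return values.get(codepoint);
            }

            @Override
            public List<String> getValueAliases(String value) {
                return aliases.get(value);
            }
        };
    }
}
```

With the toy data, `AliasSketch.resolve(AliasSketch.toyProperty(), 0x41)` picks the second alias of a two-alias value, and the same call with `0x42` falls back to the single alias. Properties that need different treatment (Decomposition_Type etc.) would still be special-cased before reaching this path.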
No good reason. I think the code might predate the formal recognition as properties.
…On Wed, Nov 27, 2024, 07:01 Robin Leroy wrote:
In unicodetools/src/main/java/org/unicode/xml/AttributeResolver.java:
> + case Script:
+ return map_script.get(codepoint).getShortName();
+ case Script_Extensions:
+ StringBuilder extensionBuilder = new StringBuilder();
+ String[] extensions = map_script_extensions.get(codepoint).split("\\|", 0);
+ for (String extension : extensions) {
+ extensionBuilder.append(
+ UcdPropertyValues.Script_Values.valueOf(extension)
+ .getShortName());
+ extensionBuilder.append(" ");
+ }
+ return extensionBuilder.toString().trim();
+ default:
+ throw new RuntimeException("Missing Catalog case");
+ }
+ case Enumerated:
PR to make it easy to see what changes have been made to support UCDXML.