`\u00{80..ff}` characters in string literals are translated incorrectly in some languages #1133
For example, if you parse a UTF-8 string with the U+00A3 POUND SIGN (`£`) character and test it for equality with the `"£"` string literal (or equivalently `"\u00a3"`), you'll get `false` in some target languages.

Below is a reproducible .ksy snippet that assumes a binary input `c2 a3` (this is the pound sign encoded in UTF-8 using Python: `"\u00a3".encode('utf-8').hex(' ') == 'c2 a3'`, `"\u00a3" == '£'`):

According to my tests, `parsed_eq_literal` will be `false` in C++, Go, Lua, Nim, PHP and Ruby. This indicates that in these languages, the string literal `"\u00a3"` was translated incorrectly, as it apparently doesn't represent a UTF-8 string with the U+00A3 character (i.e. the pound sign):

In contrast, in C#, Java, JavaScript, Perl, Python and Rust, the `parsed_eq_literal` instance evaluates to `true`, so we can say that `"\u00a3"` was translated correctly for these target languages: