Hello,
For the development of Notepad3, we use the UCHARDET Charset Detector.
In issue #1831 we ran into poor detection of Japanese UTF-8 text: UCHARDET reports it as TIS-620 (Windows-874 (Thai)) with a reliability level of 99%. 😕
These text editors detect it as UTF-8 and display it correctly:
Notepad++, Editpad Lite 7, Editplus, Notepad2, Notepad2e, Notepad2-mod,
Notepad2-zfuliu and VS Code.
Thanks in advance for your attention.
Have a nice day.
hpwamr
Feel free to test the BETA version "Notepad3Portable_5.20.116.2708_BETA.paf.exe.7z" or higher.
See "Notepad3 BETA-channel access #1129" or here Notepad3Portable_5.20.116.2708_BETA.paf.exe.7z.
Although this is a uchardet issue, it also affects libchardet, because libchardet uses the same algorithm as uchardet.
The string is too short for sampling.
If the length of the remaining string with ASCII characters removed is less than 10, accurate sampling is unlikely.
For example, ススト。 is recognized as TIS-620, but ススト。ススト。 is recognized as UTF-8.
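To make the short-sample problem concrete, here is a minimal Python sketch. It is not how uchardet or libchardet work internally, and the helper names (non_ascii_count, is_valid_utf8, prefer_utf8) are made up for illustration. It shows how few non-ASCII bytes the example string provides, and one possible workaround a caller could try before asking a statistical detector: if the data contains non-ASCII bytes and decodes strictly as UTF-8, treat it as UTF-8. Whether the editors listed above do this is an assumption, not something confirmed in this thread.

```python
def non_ascii_count(data: bytes) -> int:
    """Count the bytes outside the 7-bit ASCII range."""
    return sum(1 for b in data if b >= 0x80)


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the whole buffer decodes as strict UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def prefer_utf8(data: bytes) -> bool:
    """Hypothetical pre-check: non-ASCII bytes are present and all form valid UTF-8."""
    return non_ascii_count(data) > 0 and is_valid_utf8(data)


japanese = "ススト。".encode("utf-8")   # the short sample from this thread
thai = "สวัสดี".encode("tis_620")      # genuine TIS-620 text for comparison

print(non_ascii_count(japanese))  # 4 characters, 3 bytes each
print(prefer_utf8(japanese))      # valid UTF-8, so the pre-check accepts it
print(prefer_utf8(thai))          # real TIS-620 bytes are not valid UTF-8
```

The pre-check works here because real Thai text in TIS-620 rarely forms valid UTF-8 multi-byte sequences, while the reverse confusion (UTF-8 bytes that all happen to land on defined Thai byte values) is exactly what the short sample triggers. For pure ASCII input the pre-check returns False and the decision is left to the detector.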
Here is the incorrect detection as "TIS-620"
Here is the correct detection as "UTF-8"
Attached is the original sample: Error Detection encoding_utf-8 (issue #1831).zip
Note: "Notepad3Portable BETA" can be used in "2 flavors" (with or without the extension ".7z").
Your comments and suggestions are always welcome... 😃