Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Chardet error should be caught and rethrown with helpful error message to user saying which file the problem occurred in #3519

Open
corneliusroemer opened this issue Aug 18, 2024 · 0 comments · May be fixed by #3524
Labels

Comments

@corneliusroemer
Copy link
Contributor

I just ran codespell on all python peps with chardet -e. codespell crashed:

$ codespell --write-changes -i3 -C 5 -H -f -e --count -s --builtin clear,rare,names
Traceback (most recent call last):
  File "/Users/corneliusromer/.local/bin/codespell", line 8, in <module>
    sys.exit(_script_main())
             ^^^^^^^^^^^^^^
  File "/Users/corneliusromer/.local/pipx/venvs/codespell/lib/python3.12/site-packages/codespell_lib/_codespell.py", line 1121, in _script_main
    return main(*sys.argv[1:])
           ^^^^^^^^^^^^^^^^^^^
  File "/Users/corneliusromer/.local/pipx/venvs/codespell/lib/python3.12/site-packages/codespell_lib/_codespell.py", line 1305, in main
    bad_count += parse_file(
                 ^^^^^^^^^^^
  File "/Users/corneliusromer/.local/pipx/venvs/codespell/lib/python3.12/site-packages/codespell_lib/_codespell.py", line 963, in parse_file
    lines, encoding = file_opener.open(filename)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/corneliusromer/.local/pipx/venvs/codespell/lib/python3.12/site-packages/codespell_lib/_codespell.py", line 232, in open
    return self.open_with_chardet(filename)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/corneliusromer/.local/pipx/venvs/codespell/lib/python3.12/site-packages/codespell_lib/_codespell.py", line 257, in open_with_chardet
    lines = f.readlines()
            ^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/[email protected]/3.12.5/Frameworks/Python.framework/Versions/3.12/lib/python3.12/encodings/cp1254.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeDecodeError: 'charmap' codec can't decode byte 0x90 in position 1349: character maps to <undefined>

The chardet error should be caught and rethrown with a better error message for the end user - I don't even know which file it is based on the error message as is!

corneliusroemer added a commit to corneliusroemer/codespell that referenced this issue Aug 18, 2024
resolves codespell-project#3519

Tested and it works now, reporting the file name:

```
codespell --write-changes -i3 -C 5 -H -f -e --count -s --builtin clear,rare,names
Failed to decode file ./pep_sphinx_extensions/tests/pep_lint/test_pep_number.py using detected encoding Windows-1254.
Traceback (most recent call last):
  File "/Users/corneliusromer/micromamba/envs/codespell/bin/codespell", line 8, in <module>
    sys.exit(_script_main())
             ^^^^^^^^^^^^^^
  File "/Users/corneliusromer/code/codespell/codespell_lib/_codespell.py", line 1103, in _script_main
    return main(*sys.argv[1:])
           ^^^^^^^^^^^^^^^^^^^
  File "/Users/corneliusromer/code/codespell/codespell_lib/_codespell.py", line 1300, in main
    bad_count += parse_file(
                 ^^^^^^^^^^^
  File "/Users/corneliusromer/code/codespell/codespell_lib/_codespell.py", line 945, in parse_file
    lines, encoding = file_opener.open(filename)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/corneliusromer/code/codespell/codespell_lib/_codespell.py", line 232, in open
    return self.open_with_chardet(filename)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/corneliusromer/code/codespell/codespell_lib/_codespell.py", line 246, in open_with_chardet
    lines = self.get_lines(f)
            ^^^^^^^^^^^^^^^^^
  File "/Users/corneliusromer/code/codespell/codespell_lib/_codespell.py", line 303, in get_lines
    lines = f.readlines()
            ^^^^^^^^^^^^^
  File "/Users/corneliusromer/micromamba/envs/codespell/lib/python3.12/encodings/cp1254.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeDecodeError: 'charmap' codec can't decode byte 0x90 in position 1349: character maps to <undefined>
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants