UnicodeDecodeError when in list comprehension #4
Comments
Hi, could you give a more complete example, i.e. a script that fails, with all the imports and inputs, plus your OS and Python version? I could not reproduce this error. Rafal
Sorry, it took me a while to get something solid. Python is 3.6.8, and the script and inputs are available if you give me some contact, as they are quite large. In the meantime, this is the segment of code involved and the sequence it produced:
shuffler2 = Shuffler(d2["seq"].encode('utf-8'), shuf[1])
This is done concurrently in 80 processes. The problematic variable temp prints out as (partially):
b'\xc0\x00\x00\x00\x00\x00\x00\x00\x00\x9c\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00
These bytes are definitely not in the inputs, which are drawn from a TSV of genomic sequences. I have not found any such problematic entry so far, although I am still looking.
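For anyone following along, here is a minimal sketch of how such a buffer could be inspected to see why the decode fails. The helper name `describe_decode_failure` and the `sample` value are hypothetical; `sample` is only a shortened stand-in for the bytes printed above, not the real data.

```python
def describe_decode_failure(raw, encoding="utf-8"):
    """Return None if raw decodes cleanly, else a dict describing the failure."""
    try:
        raw.decode(encoding)
        return None
    except UnicodeDecodeError as err:
        return {
            "position": err.start,      # offset of the first undecodable byte
            "byte": raw[err.start],     # the offending byte value
            "reason": err.reason,       # codec's explanation
            "nul_count": raw.count(0),  # many NULs suggest a non-text buffer
        }


# Hypothetical, shortened stand-in for the bytes reported in this thread.
sample = b"\xc0" + b"\x00" * 8 + b"\x9c" + b"\x00" * 8
report = describe_decode_failure(sample)
print(report)
```

The byte 0xc0 can never start a valid UTF-8 sequence, so decoding fails at position 0. A buffer made up mostly of NUL bytes does not look like an encoded DNA string at all, which may point at the memory being read rather than at the input file, though that is only a guess from the output shown.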
Hi! I am very motivated to solve this, and I am willing to send you my full code if you can run it somewhere with many cores and debug it!
Hi, please feel free to send it to my GitHub e-mail. I will look at it.
I am getting an error on a large dataset during multiprocessing that I cannot seem to reproduce in isolation. A list comprehension like:
[shuffler1.shuffle().decode('utf-8') for i in range(repeats)]
raises a UnicodeDecodeError at a random point, using a 30-60 letter string of normal letters. I cannot reproduce it because the following code seems to work:
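# Collect the raw bytes first, then decode in a separate step
temp = [shuffler1.shuffle() for i in range(repeats)]
try:
    d1["shuffle"] = [x.decode('utf-8') for x in temp]
except UnicodeDecodeError:
    # Dump the raw results so the offending entry can be inspected
    print(temp)
    raise ValueError()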
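A variant of the snippet above that might make the failure easier to pin down: decode item by item and record which index failed, instead of printing the whole list. This is only a sketch; `decode_all` and the `chunks` values are hypothetical, and in the real script the list would come from `shuffler1.shuffle()`.

```python
def decode_all(chunks, encoding="utf-8"):
    """Decode each bytes item; collect failures instead of stopping at the first."""
    decoded, failures = [], []
    for index, raw in enumerate(chunks):
        try:
            decoded.append(raw.decode(encoding))
        except UnicodeDecodeError as err:
            # Keep the index, the raw bytes, and the offset of the bad byte
            failures.append((index, raw, err.start))
    return decoded, failures


# Hypothetical data: two valid sequences and one corrupted buffer.
chunks = [b"ACGT", b"\xc0\x00", b"TTGA"]
decoded, failures = decode_all(chunks)
print(decoded)
print(failures)
```

Logging the failing index next to the process ID would show whether the bad result always appears at the same call or at a random one.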
Can you think of a reason this happens?
Cheers!