Replace null char and other control characters #36

Open
mdoering opened this issue Aug 28, 2017 · 2 comments

@mdoering
Member

mdoering commented Aug 28, 2017

I am facing an issue with Postgres, which fails when the data contains a null char.
Instead of cleaning every input going into Postgres, I would prefer to let the dwca-io lib handle this and (optionally) replace the null char and maybe other control characters.

See gbif/checklistbank#38

@cgendreau
Contributor

The TabularFileNormalizer can do it, but at the moment it can only be used if you rewrite the file.
We could avoid that by simply exposing the normalizeLine method. org.gbif.dwca.record.RecordImpl is already looping over all cells, so maybe I would move the org.gbif.dwca.record.CleanUtils code into TabularFileNormalizer and create a new method there, like normalizeValue.
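
To make the idea of a per-value method concrete, here is a minimal sketch of what something like normalizeValue could do. The class name `ValueNormalizer` and the exact behaviour are assumptions for illustration only, not the existing dwca-io code: it drops the null char and other ISO control characters and keeps tab, newline and carriage return.

```java
// Hypothetical helper, not part of dwca-io: illustrates a per-value cleaner.
final class ValueNormalizer {
    private ValueNormalizer() {}

    /**
     * Removes NUL (U+0000) and other ISO control characters from a single value.
     * Assumption: tab, line feed and carriage return are kept, since they are
     * ordinary whitespace that Postgres accepts.
     */
    static String normalizeValue(String value) {
        if (value == null) {
            return null;
        }
        StringBuilder sb = new StringBuilder(value.length());
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            boolean keep = c == '\t' || c == '\n' || c == '\r'
                    || !Character.isISOControl(c);
            if (keep) {
                sb.append(c);
            }
        }
        return sb.toString();
    }
}
```

This sketch removes the offending characters rather than substituting something for them; replacing them with a space would be an equally plausible choice.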

@mdoering
Member Author

Sounds good. It would be nice if you could also supply your own custom method to clean values. The issue is fixed within clb now, so it's not pressing from my side.
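
As a rough sketch of what supplying a custom method could look like, the cleaner could be passed in as a `java.util.function.UnaryOperator<String>` and applied whenever a cell is read. The `CleaningRecord` class below is hypothetical and only stands in for a record implementation; it is not the actual RecordImpl API.

```java
import java.util.function.UnaryOperator;

// Hypothetical stand-in for a record type; not the dwca-io RecordImpl.
final class CleaningRecord {
    private final String[] cells;
    private final UnaryOperator<String> cleaner;

    /** A null cleaner means values are returned unchanged. */
    CleaningRecord(String[] cells, UnaryOperator<String> cleaner) {
        this.cells = cells.clone();
        this.cleaner = cleaner == null ? UnaryOperator.identity() : cleaner;
    }

    /** Returns the cell at the given index after applying the caller's cleaner. */
    String value(int index) {
        String raw = cells[index];
        return raw == null ? null : cleaner.apply(raw);
    }
}
```

A caller that only cares about the null char could then pass `v -> v.replace("\0", "")`, while others could plug in stricter or looser rules without changing the library.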
