This repository has been archived by the owner on Dec 12, 2024. It is now read-only.

Add ability to accept Excel metadata files, to resolve encoding problems #28

Open
kerchner opened this issue Dec 18, 2019 · 0 comments

kerchner commented Dec 18, 2019

The current batch loader CAN correctly process CSV files that contain encoded text (for example, ﺎﺴﺘﻣﺍﺭﺓ ﺶﻛﻭﻯ), producing a valid JSON file that GW ScholarSpace's rake task ingests correctly. In typical usage, however, metadata is developed in Microsoft Excel and then saved from Excel as a CSV file, and that save step garbles the encoded text.
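
A minimal sketch of how this kind of loss can happen, assuming the CSV export writes a single-byte Windows code page (`cp1252` here is an assumption, not something confirmed for every Excel setup). The sample string is illustrative Arabic text, not taken from an actual metadata file:

```python
# Illustrative Arabic metadata value, written with escapes to keep the source ASCII-only.
title = "\u0627\u0633\u062a\u0645\u0627\u0631\u0629 \u0634\u0643\u0648\u0649"

# Assumption: the CSV export encodes with a single-byte code page (cp1252).
# Characters the code page cannot represent are replaced with "?".
legacy_bytes = title.encode("cp1252", errors="replace")
garbled = legacy_bytes.decode("cp1252")

# UTF-8 can represent every Unicode character, so the round trip is lossless.
utf8_roundtrip = title.encode("utf-8").decode("utf-8")

print(garbled)                   # ??????? ????
print(utf8_roundtrip == title)   # True
```

Once the characters have been replaced, the original text cannot be recovered from the CSV file, which is why the fix has to happen before or instead of the CSV export.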

Adding the ability to read the metadata directly from an Excel-format spreadsheet would skip the CSV export step and so avoid this loss of encoding information.
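
As a rough illustration of why the Excel format preserves the text: an `.xlsx` file is a ZIP archive of UTF-8 XML parts, so cell strings can be read without passing through a legacy code page. The sketch below uses only the Python standard library. `read_xlsx_rows` is a hypothetical helper, not part of the batch loader, and it assumes the sheet is stored at `xl/worksheets/sheet1.xml`, that no cells are skipped within a row, and that strings are stored in the shared-string table rather than inline. A real implementation would more likely rely on a spreadsheet library.

```python
import zipfile
import xml.etree.ElementTree as ET

# SpreadsheetML main namespace used by worksheet and shared-string parts.
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}


def read_xlsx_rows(source, sheet_part="xl/worksheets/sheet1.xml"):
    """Return one worksheet's rows as lists of strings (hypothetical helper)."""
    with zipfile.ZipFile(source) as zf:
        shared = []
        if "xl/sharedStrings.xml" in zf.namelist():
            root = ET.fromstring(zf.read("xl/sharedStrings.xml"))
            for si in root.findall("m:si", NS):
                # A string item may be split into several <t> runs; join them.
                runs = si.iter("{%s}t" % NS["m"])
                shared.append("".join(t.text or "" for t in runs))
        sheet = ET.fromstring(zf.read(sheet_part))

    rows = []
    for row in sheet.findall("m:sheetData/m:row", NS):
        values = []
        for cell in row.findall("m:c", NS):
            v = cell.find("m:v", NS)
            text = v.text if v is not None and v.text is not None else ""
            if cell.get("t") == "s":
                # The value is an index into the shared-string table.
                text = shared[int(text)]
            values.append(text)
        rows.append(values)
    return rows
```

Calling `read_xlsx_rows("metadata.xlsx")` (the filename is a placeholder) would return the header row followed by one list per item, with non-Latin text intact as Python `str` values.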

This should resolve #18 and #23. It may also provide guidance for a related, but not identical, solution to cases where https://github.com/gwu-libraries/etd-loader receives text from ProQuest that includes special characters.

@kerchner kerchner added this to the 0.3 milestone Jan 3, 2020