Source data
The source data is a file with species checklist data you want to use as input for your Darwin Core mapping and thus the basic ingredient of our recipe. Source data could be a file you started from scratch, digitized from a publication or received from someone else.
If you want to make your source data part of your repository, place it in the data/raw directory. We already added a file checklist.xlsx as an example. Of course, you are welcome to use any other file or format and figure out how to import it in R (we recommend the R package readr for text files).
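As a minimal sketch of importing a text-based source file with readr: the file name checklist.csv and the use of a temporary directory are assumptions made only to keep the example self-contained. In a real repository the file would sit in data/raw.

```r
library(readr)

# Hypothetical text export of the source data, written to a temporary
# directory so the example runs anywhere (in practice: data/raw/<your file>).
raw_file <- file.path(tempdir(), "checklist.csv")
write_lines(
  c(
    "scientific_name,locality,occurrence_status",
    "species A,locality X,present",
    "species A,locality Y,absent",
    "species B,locality X,present"
  ),
  raw_file
)

# Read every column as character so no value is converted on import;
# type conversions can then be done explicitly in the mapping script.
raw_data <- read_csv(raw_file, col_types = cols(.default = col_character()))
```

Reading all columns as text is a design choice, not a requirement: it keeps the imported data as close as possible to the raw file.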
Ideally, your source data:

- Is tidy: each species (distribution) is a row, each attribute of it is a column. Deviation from this structure is possible, but it will complicate further processing.

  | scientific_name | locality | occurrence_status |
  | --- | --- | --- |
  | species A | locality X | present |
  | species A | locality Y | absent |
  | species B | locality X | present |

- Is where you manage the data. That is not always possible (e.g. if you got the file from someone else), but the shorter you keep the flow from where you manage the data to what you use as input for Darwin Core mapping, the better. At least, try to keep the structure of the source data the same between updates.
- Is not altered for Darwin Core. You will do that in the mapping script. Keep your source data raw.
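To illustrate the tidy structure from the first point, here is a small sketch of how a non-tidy layout (one column per locality) could be reshaped in R. The wide table is a hypothetical example, and using tidyr's pivot_longer is one possible approach, not necessarily how the recipe handles it.

```r
library(tibble)
library(tidyr)

# Hypothetical non-tidy source data: one column per locality
wide <- tribble(
  ~scientific_name, ~`locality X`, ~`locality Y`,
  "species A",      "present",     "absent",
  "species B",      "present",     NA_character_
)

# Reshape so that each species distribution becomes its own row
tidy <- pivot_longer(
  wide,
  cols = -scientific_name,
  names_to = "locality",
  values_to = "occurrence_status",
  values_drop_na = TRUE # drop species/locality combinations without a status
)
```

The result has the three columns scientific_name, locality and occurrence_status, with one row per species and locality.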
If you are starting your checklist from scratch, our recipe comes with a source data template (checklist.xlsx) to get you started. We decided to use a Microsoft Excel file, as it is often used to manage datasets, despite its limitations (proprietary, limited import options in R, etc.). The template contains the worksheets checklist for your data, README with instructions, and controlled vocabularies to populate dropdowns. The template contains fictional data for 12 species to test the recipe, which you can remove and replace with your own data. You are also free to change the structure of the template as you see fit, but remember to adapt the mapping script accordingly.
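A minimal sketch of reading the template's checklist worksheet into R, assuming the readxl package is used (the recipe's own mapping script may import the file differently). The path follows the data/raw location mentioned above, and the code only runs the import if the file is present.

```r
library(readxl)

# Location of the template within the repository
template <- file.path("data", "raw", "checklist.xlsx")

if (file.exists(template)) {
  # List the worksheets available in the workbook
  print(excel_sheets(template))

  # Read only the worksheet holding the data, keeping all columns as text
  raw_data <- read_excel(template, sheet = "checklist", col_types = "text")
}
```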