Tidy data
Peter Desmet edited this page Jan 30, 2019 · 4 revisions
The basis for each mapping process is a tidy dataset. This implies that:
- Each variable forms a column
- Each observation forms a row
- Each type of observational unit forms a table
The checklist template provided in this recipe is a good example of a tidy dataset. For more information on tidy datasets, see Hadley Wickham's paper on tidy data. Starting with untidy data makes the mapping script a lot more complex, because many preparatory steps are needed before you can even start mapping. A good dose of "data hygiene" is therefore essential.
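To make the three rules more concrete, here is a minimal sketch of turning an untidy table into a tidy one. The data, the species names and the column names (`scientific_name`, `country`, `occurrence_status`) are invented for illustration and are not taken from the checklist template. The sketch assumes a version of the tidyr package that provides `pivot_longer()`.

```r
library(tidyr)

# Hypothetical untidy checklist: the variable "country" is hidden in the
# column headers (BE, NL), so one row holds two observations
untidy <- data.frame(
  scientific_name = c("Aus bus", "Cus dus"),
  BE = c("present", "present"),
  NL = c("absent", "present"),
  stringsAsFactors = FALSE
)

# Reshape so that each variable forms a column and each observation
# (one species in one country) forms a row
tidy <- pivot_longer(
  untidy,
  cols = c(BE, NL),
  names_to = "country",
  values_to = "occurrence_status"
)

print(tidy)
```

After reshaping, each species/country combination has its own row, so later mapping steps can work on one column at a time.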
In a tidy dataset, you should be able to start the mapping immediately. However, some small preparatory cleaning steps may still be required, such as removing empty rows. For this, you can use the function remove_empty() from the janitor package:
input_data %<>% remove_empty("rows")
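As a self-contained sketch of that step, the example below builds a small, made-up `input_data` (the column names and values are hypothetical) and removes its empty row. The `%<>%` operator comes from the magrittr package: it pipes `input_data` into the function and assigns the result back to `input_data`.

```r
library(janitor)
library(magrittr)

# Hypothetical input with one completely empty row (all values NA)
input_data <- data.frame(
  scientific_name = c("Aus bus", NA, "Cus dus"),
  kingdom = c("Animalia", NA, "Plantae"),
  stringsAsFactors = FALSE
)

# Drop rows in which every value is NA and overwrite input_data
input_data %<>% remove_empty("rows")

print(input_data)
```

Note that remove_empty() considers a row empty when all its values are NA. Cells that contain empty strings would first need to be converted to NA.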
Depending on the specific checklist, more cleaning steps may be necessary.