From fcedd52bc07a4113af0d55641b9ef823be9214db Mon Sep 17 00:00:00 2001
From: QuantScripter <95710662+devpowerplatform@users.noreply.github.com>
Date: Thu, 24 Oct 2024 14:58:50 -0500
Subject: [PATCH] Update tidy-data.Rmd (#1557)

* Update tidy-data.Rmd

It is not easy to try out the two data sets (tb and weather). With this change, anyone can run the code to get the two data sets. Also use one of the tidyverse packages to read in the data sets.

* Show how to follow along at home

---------

Co-authored-by: Davis Vaughan
---
 vignettes/tidy-data.Rmd | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/vignettes/tidy-data.Rmd b/vignettes/tidy-data.Rmd
index 2bf153a8..f695ef4c 100644
--- a/vignettes/tidy-data.Rmd
+++ b/vignettes/tidy-data.Rmd
@@ -198,6 +198,8 @@ billboard3 %>% arrange(date, rank)
 After pivoting columns, the key column is sometimes a combination of multiple underlying variable names. This happens in the `tb` (tuberculosis) dataset, shown below. This dataset comes from the World Health Organisation, and records the counts of confirmed tuberculosis cases by `country`, `year`, and demographic group. The demographic groups are broken down by `sex` (m, f) and `age` (0-14, 15-25, 25-34, 35-44, 45-54, 55-64, unknown).
 
 ```{r}
+# To run this on your own:
+# tb <- readr::read_csv("https://raw.githubusercontent.com/tidyverse/tidyr/main/vignettes/tb.csv")
 tb <- as_tibble(read.csv("tb.csv", stringsAsFactors = FALSE))
 tb
 ```
@@ -244,6 +246,8 @@ tb %>% pivot_longer(
 The most complicated form of messy data occurs when variables are stored in both rows and columns. The code below loads daily weather data from the Global Historical Climatology Network for one weather station (MX17004) in Mexico for five months in 2010.
 
 ```{r}
+# To run this on your own:
+# weather <- readr::read_csv("https://raw.githubusercontent.com/tidyverse/tidyr/main/vignettes/weather.csv")
 weather <- as_tibble(read.csv("weather.csv", stringsAsFactors = FALSE))
 weather
 ```