-
Notifications
You must be signed in to change notification settings - Fork 2
/
README.rmd
114 lines (82 loc) · 5.42 KB
/
README.rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r setup, include = FALSE}
knitr::opts_chunk$set(
# echo = TRUE,
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```
# collateral <img src="man/figures/logo.svg" align="right" width="180px" style="padding-left: 1rem;" />
<!-- badges: start -->
```{r pkgversion, echo = FALSE}
version <- as.vector(read.dcf("DESCRIPTION")[, "Version"])
version <- gsub("-", ".", version)
```
![packageversion](https://img.shields.io/badge/Package%20version-`r version`-orange.svg?style=flat-square)]
![Last-changedate](https://img.shields.io/badge/last%20change-`r gsub('-', '--', Sys.Date())`-yellowgreen.svg)
[![lifecycle](https://img.shields.io/badge/lifecycle-stable-brightgreen.svg)](https://lifecycle.r-lib.org/articles/stages.html#stable-1)
[![CRAN status](https://www.r-pkg.org/badges/version/collateral)](https://cran.r-project.org/package=collateral)
[![license](https://img.shields.io/github/license/mashape/apistatus.svg)](https://choosealicense.com/licenses/mit/)
<!-- badges: end -->
R is great at automating a data analysis over many groups in a dataset, but errors, warnings and other side effects can stop it in its tracks. There's nothing worse than returning after an hour (or a day!) and discovering that 90% of your data wasn't analysed because one group threw an error.
With `collateral`, you can capture side effects like errors to prevent them from stopping execution. Once your analysis is done, you can see a tidy view of the groups that failed, the ones that returned a results, and the ones that threw warnings, messages or other output as they ran.
You can even filter or summarise a data frame based on these side effects to automatically continue analysis without failed groups (or perhaps to show a report of the failed ones).
For example, this screenshot shows a data frame that's been nested using groups of the `cyl` column. The `qlog` column shows the results of an operation that's returned results for all three groups but also printed a warning for one of them:
![Screenshot of a nested dataframe, including a column that shows the results of an operation quietly mapped with collateral](man/figures/collateral_example.png)
If you're not familiar with `purrr` or haven't used a list-column workflow in R before, the [`collateral` vignette](https://collateral.jamesgoldie.dev/articles/collateral.html) shows you how it works, the benefits for your analysis and how `collateral` simplifies the process of handling complex mapped operations.
If you're already familiar with `purrr`, the [tl;dr](https://en.wikipedia.org/wiki/Wikipedia:Too_long;_didn%27t_read) is that [`collateral::map_safely()` and `collateral::map_quietly()` (and their `map2` and `pmap` variants)](https://collateral.jamesgoldie.dev/reference/collateral_mappers.html) will automatically wrap your supplied function in `safely()` or `quietly()` (or both: `peacefully()`) and will provide enhanced printed output and tibble displays. You can then use the [`has_*()`](https://collateral.jamesgoldie.dev/reference/has.html) and [`tally_*()`](https://collateral.jamesgoldie.dev/reference/tally.html) functions to filter or summarise the returned tibbles or lists.
## Installation
You can install the released version of collateral from [CRAN](https://CRAN.R-project.org) with:
``` r
install.packages("collateral")
```
And the development version from [GitHub](https://github.com/) with:
``` r
# install.packages("devtools")
devtools::install_github("jimjam-slam/collateral")
```
## Example
This example uses the famous `mtcars` dataset---but first, we're going to sabotage a few of the rows by making them negative. The `log` function produces `NaN` with a warning when you give it a negative number.
It'd be easy to miss this in a non-interactive script if you didn't explicitly test for the presence of `NaN`! Thankfully, with collateral, you can see which operations threw errors, which threw warnings, and which produced output:
```{r}
library(tibble)
library(dplyr)
library(tidyr)
library(collateral)
test <-
# tidy up and trim down for the example
mtcars %>%
rownames_to_column(var = "car") %>%
as_tibble() %>%
select(car, cyl, disp, wt) %>%
# spike some rows in cyl == 4 to make them fail
mutate(wt = dplyr::case_when(
wt < 2 ~ -wt,
TRUE ~ wt)) %>%
# nest and do some operations peacefully
nest(data = -cyl) %>%
mutate(qlog = map_peacefully(data, ~ log(.$wt)))
# look at the results
test
```
Here, we can see that all operations produced output (because `NaN` is still output)---but a few of them also produced warnings! You can then see those warnings...
```{r}
test %>% mutate(qlog_warning = map_chr(qlog, "warnings", .null = NA))
```
... filter on them...
```{r}
test %>% filter(!has_warnings(qlog))
```
... or get a summary of them, for either interactive or non-interactive purposes:
```{r}
summary(test$qlog)
```
## Other features
The collateral package is now fully integrated with the `furrr` package, so you can safely and quietly iterate operations across CPUs cores or remote nodes. All collateral mappers have `future_*`-prefixed variants for this purpose.
## Support
If you have a problem with `collateral`, please don't hesitate to [file an issue](https://github.com/jimjam-slam/collateral/issues/new) or [contact me](https://twitter.com/jimjam_slam/)!