Skip to content

Commit

Permalink
Move .by to the data frame method, remove .by handling, misc tweaks
Browse files Browse the repository at this point in the history
  • Loading branch information
DavisVaughan committed Aug 27, 2024
1 parent c346ccb commit 6d437e1
Show file tree
Hide file tree
Showing 5 changed files with 102 additions and 54 deletions.
6 changes: 4 additions & 2 deletions NEWS.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,9 @@
# tidyr (development version)

* `fill()` gains a `.by` argument as an alternative to `dplyr::group_by()` for
applying the fill per group, similar to `nest(.by =)` and
`dplyr::mutate(.by =)` (@olivroy, #1439).

* `expand_grid()` gains a new `.vary` argument, allowing users to control
whether the first column varies fastest or slowest (#1543, @JamesHWade).

Expand All @@ -12,8 +16,6 @@

* tidyr now requires dplyr >=1.1.0 (#1568, @catalamarti).

* `fill()` gains `.by` similar to how `mutate()` works (@olivroy, #1439).

# tidyr 1.3.1

* `pivot_wider` now uses `.by` and `|>` syntax for the dplyr helper message to
Expand Down
57 changes: 32 additions & 25 deletions R/fill.R
Original file line number Diff line number Diff line change
Expand Up @@ -9,11 +9,11 @@
#' @section Grouped data frames:
#' With grouped data frames created by [dplyr::group_by()], `fill()` will be
#' applied _within_ each group, meaning that it won't fill across group
#' boundaries.
#' boundaries. This can also be accomplished using the `.by` argument to
#' `fill()`, which creates a temporary grouping for just this operation.
#'
#' @param data A data frame.
#' @param ... <[`tidy-select`][tidyr_tidy_select]> Columns to fill.
#' @inheritParams dplyr::mutate
#' @param .direction Direction in which to fill missing values. Currently
#' either "down" (the default), "up", "downup" (i.e. first down and then up)
#' or "updown" (first up and then down).
Expand Down Expand Up @@ -83,27 +83,39 @@
#' 3, "Danielle", "Observer", NA
#' )
#'
#' # The values are inconsistently missing by position within the group
#' # Use .direction = "downup" to fill missing values in both directions
#' # The values are inconsistently missing by position within the `group`.
#' # Use `.direction = "downup"` to fill missing values in both directions
#' # and `.by = group` to apply the fill per group.
#' squirrels %>%
#' fill(n_squirrels, .direction = "downup", .by = group)
#'
#' # If you want, you can also supply a data frame grouped with `group_by()`,
#' # but don't forget to `ungroup()`!
#' squirrels %>%
#' dplyr::group_by(group) %>%
#' fill(n_squirrels, .direction = "downup") %>%
#' dplyr::ungroup()
#'
#' # Using `.direction = "updown"` accomplishes the same goal in this example
#'
#' # Using .by simplifies this example.
#' squirrels %>%
#' fill(n_squirrels, .direction = "downup", .by = group)
fill <- function(data, ..., .direction = c("down", "up", "downup", "updown"), .by = NULL) {
check_dots_unnamed()
fill <- function(data,
...,
.direction = c("down", "up", "downup", "updown")) {
UseMethod("fill")
}

#' @inheritParams dplyr::mutate
#' @rdname fill
#' @export
fill.data.frame <- function(data, ..., .direction = c("down", "up", "downup", "updown"), .by = NULL) {
vars <- names(tidyselect::eval_select(expr(c(...)), data = data, allow_rename = FALSE))
by <- names(tidyselect::eval_select(enquo(.by), data = data, allow_rename = FALSE))
fill.data.frame <- function(data,
...,
.direction = c("down", "up", "downup", "updown"),
.by = NULL) {
# Must be here rather than in the generic due to the placement of `.by`
check_dots_unnamed()

vars <- names(tidyselect::eval_select(
expr = expr(c(...)),
data = data,
allow_rename = FALSE
))

.direction <- arg_match0(
arg = .direction,
Expand All @@ -114,14 +126,9 @@ fill.data.frame <- function(data, ..., .direction = c("down", "up", "downup", "u
vec_fill_missing(col, direction = .direction)
}

if (dplyr::is_grouped_df(data)) {
if (length(by) > 0) {
cli::cli_abort(c(
"Can't supply {.arg .by} when {.arg data} is a grouped data frame."
))
}
dplyr::mutate(data, dplyr::across(any_of(vars), .fns = fn))
} else {
dplyr::mutate(data, dplyr::across(any_of(vars), .fns = fn), .by = all_of(by))
}
dplyr::mutate(
data,
dplyr::across(any_of(vars), .fns = fn),
.by = {{ .by }}
)
}
24 changes: 14 additions & 10 deletions man/fill.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

25 changes: 19 additions & 6 deletions tests/testthat/_snaps/fill.md
Original file line number Diff line number Diff line change
@@ -1,3 +1,13 @@
# errors on named `...` inputs

Code
fill(df, fooy = x)
Condition
Error in `fill()`:
! Arguments in `...` must be passed by position, not name.
x Problematic argument:
* fooy = x

# validates its inputs

Code
Expand All @@ -6,17 +16,20 @@
Error in `fill()`:
! `.direction` must be one of "down", "up", "downup", or "updown", not "foo".

# fill works with .by
# `.by` can't select columns that don't exist

Code
df %>% fill(y, .by = z)
fill(df, y, .by = z)
Condition
Error in `fill()`:
Error in `dplyr::mutate()`:
! Can't select columns that don't exist.
x Column `z` doesn't exist.

# `.by` can't be used on a grouped data frame

Code
gr_df %>% fill(y, .by = x)
fill(df, y, .by = x)
Condition
Error in `fill()`:
! Can't supply `.by` when `data` is a grouped data frame.
Error in `dplyr::mutate()`:
! Can't supply `.by` when `.data` is a grouped data frame.

44 changes: 33 additions & 11 deletions tests/testthat/test-fill.R
Original file line number Diff line number Diff line change
Expand Up @@ -106,37 +106,59 @@ test_that("fill preserves attributes", {
expect_equal(attributes(out_u$x), attributes(df$x))
})

test_that("fill respects grouping and `.by`", {
test_that("fill respects existing grouping and `.by`", {
df <- tibble(x = c(1, 1, 2), y = c(1, NA, NA))

out <- df %>%
dplyr::group_by(x) %>%
fill(y)
expect_equal(out$y, c(1, 1, NA))
expect_identical(out$y, c(1, 1, NA))
expect_identical(dplyr::group_vars(out), "x")

out <- df %>%
fill(y, .by = x)
expect_equal(out$y, c(1, 1, NA))


expect_identical(out$y, c(1, 1, NA))
expect_identical(dplyr::group_vars(out), character())
})

test_that("works when there is a column named `.direction` in the data (#1319)", {
df <- tibble(x = c(1, NA, 2), .direction = 1:3)
expect_no_error(out <- fill(df, x))
out <- fill(df, x)
expect_identical(out$x, c(1, 1, 2))
})

test_that("errors on named `...` inputs", {
df <- tibble(x = c(1, NA, 2))

expect_snapshot(error = TRUE, {
fill(df, fooy = x)
})
})

test_that("validates its inputs", {
df <- tibble(x = c(1, NA, 2))
expect_snapshot(error = TRUE, {
df %>% fill(x, .direction = "foo")
})
})

test_that("fill works with .by", {
df <- tibble(x = c(1, 1, 2), y = c(1, NA, NA))
gr_df <- dplyr::group_by(df, x)
test_that("`.by` can't select columns that don't exist", {
df <- tibble(x = 1, y = 2)

# This shows `mutate()` in the error, but we accept that to not have to handle
# `.by` in any way.
expect_snapshot(error = TRUE, {
fill(df, y, .by = z)
})
})

test_that("`.by` can't be used on a grouped data frame", {
df <- tibble(x = 1, y = 2)
df <- dplyr::group_by(df, x)

# This shows `mutate()` in the error, but we accept that to not have to handle
# `.by` in any way.
expect_snapshot(error = TRUE, {
df %>% fill(y, .by = z)
gr_df %>% fill(y, .by = x)
fill(df, y, .by = x)
})
})

0 comments on commit 6d437e1

Please sign in to comment.