stringr 1.5.0
Breaking changes
- stringr functions now consistently implement the tidyverse recycling rules
  (#372). There are two main changes:

  - Only vectors of length 1 are recycled. Previously, for example,
    str_detect(letters, c("x", "y")) worked, but it now errors.

  - str_c() ignores NULLs, rather than treating them as length 0 vectors.

  Additionally, many more arguments now throw errors, rather than warnings,
  if supplied the wrong type of input.

- regex() and friends now generate class names with a stringr_ prefix (#384).

- str_detect(), str_starts(), str_ends() and str_subset() now error
  when used with either an empty string ("") or a boundary(). These
  operations didn't really make sense (str_detect(x, "") returned TRUE
  for all non-empty strings) and made it easy to make mistakes when programming.
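
The breaking changes above can be sketched roughly as follows, assuming stringr 1.5.0 or later is installed. The errors() helper is hypothetical and exists only for this illustration; it is not part of stringr.

```r
library(stringr)

# Hypothetical helper (not part of stringr): TRUE if evaluating `expr` errors.
errors <- function(expr) inherits(try(expr, silent = TRUE), "try-error")

# A length-1 pattern is still recycled across the whole input.
hits <- str_detect(c("apple", "banana"), "an")   # FALSE TRUE

# A length-2 pattern against a length-26 input is no longer recycled.
recycle_fails <- errors(str_detect(letters, c("x", "y")))

# NULL arguments are ignored.
joined <- str_c("a", NULL, "b")                  # "ab"

# An empty pattern is now rejected.
empty_fails <- errors(str_detect("abc", ""))
```

Wrapping calls in a check like errors() is one way to find code that relied on the old recycling behaviour before upgrading.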
New features
- Many tweaks to the documentation to make it more useful and consistent.

- New vignette("from-base") by @sastoudt provides a comprehensive comparison
  between base R functions and their stringr equivalents. It's designed to
  help you move to stringr if you're already familiar with base R string
  functions (#266).

- New str_escape() escapes regular expression metacharacters, providing
  an alternative to fixed() if you want to compose a pattern from
  user-supplied strings (#408).

- New str_equal() compares two character vectors using Unicode rules,
  optionally ignoring case (#381).

- str_extract() can now optionally extract a capturing group instead of
  the complete match (#420).

- New str_flatten_comma() is a special case of str_flatten() designed for
  comma-separated flattening. It correctly handles the Oxford comma
  when there are only two elements (#444).

- New str_split_1() is tailored for the special case of splitting up a single
  string (#409).

- New str_split_i() extracts a single piece from a string (#278, @bfgray3).

- New str_like() allows the use of SQL wildcards (#280, @rjpat).

- New str_rank() to complete the set of order/rank/sort functions (#353).

- New str_sub_all() to extract multiple substrings from each string.

- New str_unique() is a wrapper around stri_unique() and returns unique
  string values in a character vector (#249, @seasmith).

- str_view() uses ANSI colouring rather than an HTML widget (#370). This
  works in more places and requires fewer dependencies. It also includes a
  number of other small improvements.

- New str_width() returns the display width of a string (#380).

- stringr is now licensed as MIT (#351).
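
A few of the new functions listed above, shown as a minimal sketch that assumes stringr 1.5.0 or later. The input strings and variable names are made up for illustration.

```r
library(stringr)

# str_escape(): treat a user-supplied string as literal text inside a regex.
escaped <- str_escape("a.b")
literal_only <- str_detect(c("a.b", "axb"), escaped)      # TRUE FALSE

# str_equal(): comparison that can ignore case.
same <- str_equal("a", "A", ignore_case = TRUE)           # TRUE

# str_extract(): return a capturing group rather than the whole match.
value <- str_extract("key=value", "(\\w+)=(\\w+)", group = 2)

# str_flatten_comma(): the Oxford comma is dropped for two elements.
three <- str_flatten_comma(c("a", "b", "c"), last = ", and ")
two   <- str_flatten_comma(c("a", "b"), last = ", and ")

# str_split_1() returns a character vector; str_split_i() returns one piece.
pieces <- str_split_1("a-b-c", "-")
second <- str_split_i("a-b-c", "-", 2)

# str_like(): SQL-style wildcards (% matches any run of characters).
like <- str_like("apple", "app%")

# str_rank(), str_unique(), str_width() and str_sub_all().
ranks  <- str_rank(c("b", "a", "c"))
uniq   <- str_unique(c("a", "b", "a"))
width  <- str_width("abc")
chunks <- str_sub_all("abcdef", start = c(1, 4), end = c(3, 6))
```

Note that str_sub_all() returns a list with one element per input string, whereas str_split_1() returns a plain character vector because it accepts only a single string.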
Minor improvements and bug fixes
- Better error message if you supply a non-string pattern (#378).

- A new data source for sentences has fixed many small errors.

- str_extract() and str_extract_all() now work correctly when pattern
  is a boundary().

- str_flatten() gains a last argument that optionally overrides the
  final separator (#377). It also gains an na.rm argument to remove missing
  values (since it's a summary function) (#439).

- str_pad() gains a use_width argument to control whether to use the total
  code point width or the number of code points as the "width" of a string
  (#190).

- str_replace() and str_replace_all() can use the standard tidyverse formula
  shorthand for the replacement function (#331).

- str_starts() and str_ends() now correctly respect regex operator
  precedence (@carlganz).

- str_wrap() breaks only at whitespace by default; set
  whitespace_only = FALSE to return to the previous behaviour (#335, @rjpat).

- word() now returns the whole sentence when using a negative start
  parameter whose magnitude is greater than or equal to the number of words
  (@pdelboca, #245).
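
Some of the improvements above, illustrated with a short sketch that assumes stringr 1.5.0 or later. The example inputs are invented for illustration.

```r
library(stringr)

# str_flatten(): `last` overrides the final separator, `na.rm` drops NAs.
listed  <- str_flatten(c("a", "b", "c"), collapse = ", ", last = " and ")
no_na   <- str_flatten(c("a", NA, "b"), na.rm = TRUE)

# str_replace_all(): formula shorthand for the replacement function.
shouted <- str_replace_all("abc", "[ab]", ~ toupper(.x))

# str_starts(): alternation is applied to the whole pattern, so "a|b"
# means "starts with a or starts with b".
starts  <- str_starts(c("ab", "xb"), "a|b")

# str_pad(): `use_width = FALSE` pads by number of code points.
padded  <- str_pad("a", 3, use_width = FALSE)

# str_wrap(): opt back in to breaking at non-whitespace boundaries.
wrapped <- str_wrap("a/b c/d e/f", width = 4, whitespace_only = FALSE)
```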