forked from aedobbyn/postal
-
Notifications
You must be signed in to change notification settings - Fork 0
/
README.Rmd
398 lines (261 loc) · 13.4 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
---
output: github_document
---
```{r setup, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%",
cache = TRUE
)
```
# postal `r emo::ji("mailbox")`
[![Project Status: Active - The project has reached a stable, usable state and is being actively developed.](http://www.repostatus.org/badges/latest/active.svg)](http://www.repostatus.org/#active)
[![Travis build status](https://travis-ci.org/aedobbyn/postal.svg?branch=master)](https://travis-ci.org/aedobbyn/postal)
[![Coverage status](https://codecov.io/gh/aedobbyn/postal/branch/master/graph/badge.svg)](https://codecov.io/github/aedobbyn/postal?branch=master)
[![AppVeyor Build Status](https://ci.appveyor.com/api/projects/status/github/aedobbyn/postal?branch=master&svg=true)](https://ci.appveyor.com/project/aedobbyn/postal)
[![CRAN_Status_Badge](http://www.r-pkg.org/badges/version/postal)](https://cran.r-project.org/package=postal)
[![rstudio mirror downloads](https://cranlogs.r-pkg.org/badges/grand-total/postal)](https://github.com/metacran/cranlogs.app)
Want an estimate of the price of sending a package somewhere via the US Postal Service? Need to get the USPS shipping zone between two zip codes?
Well, this is a `r emo::ji("package")` for your `r emo::ji("package")`s. `postal` provides a tidy interface to the USPS domestic [zone calc](https://postcalc.postal.com/DomesticZoneChart/) and [post calc](https://postcalc.postal.com/Calculator/) APIs.
<br>
<p align="center">
<img src="https://media.giphy.com/media/iVoiJfBtSsi0o/giphy.gif" alt="im_not_an_owl">
</p>
<br>
### Installation
From [CRAN](https://cran.r-project.org/web/packages/postal/index.html):
```{r, eval=FALSE}
install.packages("postal")
```
The development version:
```{r, eval=FALSE}
# install.packages("devtools")
devtools::install_github("aedobbyn/postal")
```
## Postage Price Calculator
The single postage calculation function, `fetch_mail`, works for flat-rate envelopes and boxes (the kind you pick up at the post office and wrestle with until they fold into a box shape) as well as for packages, which vary by their weight and dimensions.
Currently only destinations in the US are supported.
### Usage
Specify a 5-digit origin zip and destination zip, along with the date and time you're going to be shipping (`"today"` and `"now"` are allowed). Other specifics are optional.
```{r}
library(postal)
```
USPS offers many colorful options to handle all your shipping needs, which are included in the arguments to `fetch_mail`. So to answer the burning question...what if we wanted to ship live animals from Wyoming to Philly by ground on July 2 at 2:30pm in a nonrectangular package??
```{r}
fluffy <- fetch_mail(origin_zip = "88201",
destination_zip = "19109",
shipping_date = "2018-07-02",
shipping_time = "14:30",
live_animals = TRUE,
ground_transportation_needed = TRUE,
pounds = 42,
ounces = 3,
length = 12,
width = 10,
height = 7,
girth = 5,
shape = "nonrectangular",
verbose = FALSE)
```
When will it get there and how much will it cost?
```{r}
fluffy %>%
dplyr::pull(delivery_day)
fluffy %>%
dplyr::pull(retail_price)
```
Finally, the important questions have been answered.
<br>
#### General case
For a more usual case, we'll send a 15lb package from Portland, Maine to Portland, Oregon. The response shows all shipping options along with their prices, dimensions, and delivery dates.
```{r}
(mail <- fetch_mail(origin_zip = "04101",
destination_zip = "97211",
shipping_date = "today",
shipping_time = "now",
pounds = 15,
type = "package",
shape = "rectangular",
show_details = TRUE)) %>%
dplyr::slice(1:3)
mail %>%
dplyr::slice(1:3) %>%
knitr::kable()
```
The web interface should display the same results:
<p align="center">
<img src="./man/figures/portland_to_portland.jpg" alt="post_calc" width="70%">
</p>
`fetch_mail` is a good option if you want to display data in the way USPS does. If you want to compute on prices and dates, you can tidy the dataframe by sending it into `scrub_mail`.
`scrub_mail` replaces `"Not available"`s and empty strings with `NA`s, changes prices to numeric, splits delivery day into a date and time of day (we infer year by the current year and use the 24hr clock), and computes the delivery duration in days.
```{r, warning=FALSE}
mail %>%
scrub_mail() %>%
dplyr::slice(1:3) %>%
dplyr::select(
delivery_date, delivery_by_time,
delivery_duration, retail_price,
click_n_ship_price, dplyr::everything()
)
```
<br>
#### Multiple inputs and error handling
These functions work on a single origin and single destination, but multiple can be mapped into a tidy dataframe. Important parts of the request (`origin_zip`, `destination_zip`, `shipping_date`, and `shipping_time`) are included in the result, making it easier to distinguish different inputs from one another.
By default we try the API 3 times before giving up. You can modify that by changing `n_tries`. If after `n_tries` we still have an error (here, `"foo"` and `"bar"` are not good zips), a `"no_success"` row is returned so that we don't error out on the first failure.
```{r fetch_bad_zips}
origins <- c("11238", "foo", "60647", "80222")
destinations <- c("98109", "94707", "bar", "04123")
purrr::map2_dfr(
origins, destinations,
fetch_mail,
type = "box",
n_tries = 3,
verbose = FALSE
)
```
Similarly, if a response is received but no mail services are found, a dataframe with missing values is returned.
```{r}
fetch_mail(origin_zip = "04101",
destination_zip = "97211",
shipping_date = "3018-07-04", # way in the future!
type = "package",
show_details = TRUE)
```
This approach makes takes care of much of the try-catching you might have to implement, with the aim of making it easier to request a lot of data in one go.
***
<br>
## Zones
Zones! A **zone** is a [representation of distance](https://ribbs.usps.gov/zone_charts/documents/tech_guides/ZoneChartExceptionsWebinar.pdf) between the origin and the destination zip codes. Zones are used in determining postage rates and delivery times.
Sometimes you just need to know the shipping zone between your origin and destination. Or maybe between *all* origins and *all* destinations for some app you're building.
That doesn't sound so bad, but there are `99999^2` or 9,999,800,001 possible 5-digit origin-destination zip combinations in the US. The USPS [Zone Calc](https://postcalc.usps.com/DomesticZoneChart/) tool narrows down that space a bit by trimming zips to their first 3 digits. Every 5 digit zip's information is defined by its 3-digit prefix, except for 5-digit exceptions, which are noted.
### Usage
`fetch_zones_three_digit` lets you find the zone corresponding to a 3-digit origin zip prefix and one or many 3-digit destination zip prefixes.
```{r}
fetch_zones_three_digit(origin_zip = "123",
destination_zip = "581")
```
If no destination is supplied, all valid desination zips and zones are returned for the origin.
```{r}
fetch_zones_three_digit(origin_zip = "321")
```
<br>
#### Multiple zips
You can provide a vector of zips and map them nicely into a long dataframe. Here we ask for all destination zips for these three origin zips.
If an origin zip is supplied that is [not in use](https://en.wikipedia.org/wiki/List_of_ZIP_code_prefixes), it is messaged and included in the output with `NA`s in the other columns. For example, the origin `"001"` is not a valid 3-digit zip prefix.
```{r}
origin_zips <- c("001", "271", "828")
origin_zips %>%
purrr::map_dfr(fetch_zones_three_digit)
```
Similarly, map over both origin and destination zips and end up at a dataframe. `verbose` gives you a play-by-play if you want it. (More on auto-prepending leading 0s to input zips in the [On Digits](https://github.com/aedobbyn/postal#on-digits) section below.)
```{r}
dest_zips <- c("867", "53", "09")
purrr::map2_dfr(origin_zips, dest_zips,
fetch_zones_three_digit,
verbose = TRUE)
```
<br>
<br>
#### Ranges and other features
The USPS zone calc web interface displays zones only as they pertain to destination zip code *ranges*:
<p align="center">
<img src="./man/figures/zone_calc.jpg" alt="zone_calc" width="70%">
</p>
<br>
If you prefer the range representation, you can set `as_range = TRUE`. Instead of a `dest_zip` column, you'll get a marker of the beginning of and end of the range in `dest_zip_start` and `dest_zip_end`.
```{r}
fetch_zones_three_digit("42", "42",
as_range = TRUE)
```
<br>
### Details
You can optionally display other details about the zips, zones, and type of postage the zone designation applies to.
```{r}
fetch_zones_three_digit(origin_zip = "404",
show_details = TRUE)
```
Definitions of these details can be found in `zone_detail_definitions`.
```{r zone_detail_definitions}
zone_detail_definitions %>%
knitr::kable()
```
<br>
### On Digits
The API endpoint used in `fetch_zones_three_digit` accepts exactly 3 digits for the origin zip; it mostly returns 3 digit destination zips, but also some 5 digit exceptions. For that reason,
* If *fewer than 3 digits* are supplied, leading zeroes are added with a message
* e.g. `"8"` becomes `"008"`
* If *more than 5 digits* are supplied, the zip is truncated to the first 5 with a warning
* If the zip is an origin, only the first 3 of those 5 digits are sent to the API
* If the zip is a destination, the `exact_destination` flag determines whether we results for the that destination's 3-digit prefix filter or filter to only the exact 5-digit destination
For example, when a 5-digit destination is supplied and `exact_destination` is `FALSE`, we include results for the destination `962` as well as for the exact one supplied, `96240`.
```{r}
fetch_zones_three_digit(origin_zip = "12358132134558",
destination_zip = "96240",
exact_destination = FALSE)
```
When `exact_destination` is `TRUE`, we filter only to `96240`, which is a 5 digit exception as its zone is different from its 3-digit prefix's.
```{r}
fetch_zones_three_digit(origin_zip = "12358132134558",
destination_zip = "96240",
exact_destination = TRUE)
```
<br>
#### I just want to supply 5 digits
`fetch_zones_three_digit` should cover most 5 digit cases and supply the most information when `show_details` is `TRUE`.
But if you just want to use the equivalent of the ["Get Zone for ZIP Code Pair"](https://postcalc.usps.com/DomesticZoneChart/) tab, you can use `fetch_zones_five_digit`.
```{r}
fetch_zones_five_digit("31415", "92653")
```
Details given when `show_details = TRUE` in `fetch_zones_five_digit` are slightly different than they are for `fetch_zones_three_digit` (see [Details](https://github.com/aedobbyn/postal#details)).
<br>
### All of the data
If you want the most up-to-date zip-zone mappings, `fetch_all` allows you to use the 3 digit endpoint to fetch all possible origins and, optionally, write them to a CSV as you go.
By default we use every possible origin from `"000"` to `"999"`; as of now `"000"` through `"004"` are all not in use along with a smattering of others like `"404"` and `"867"` -- but who knows, they might be used in the future.
```{r, eval=FALSE}
fetch_all(all_possible_origins,
sleep_time = 0.5, # How long to sleep in between requests, on average
write_to = "path/to/my/file.csv")
```
If there's a network error when grabbing a zip, we back off and try a few times and finally write `"no_success"` (rather than `NA`s which indicate that the origin zip is not in use) in the destination zip columns.
What that looks like in the event we switch on the internet between asking for origin `"456"` and origin `"789"`:
```{r, echo=FALSE}
tibble::tribble(
~"origin_zip", ~"dest_zip", ~"zone",
"123", "no_success", "no_success",
"456", "no_success", "no_success",
"789", "005", "7",
"789", "006", "8",
"789", "007", "8",
"789", "008", "8",
"789", "009", "8",
"789", "010", "7",
"...", "...", "..."
)
```
The entire set is also made available from a read-only MySQL database, which you can connect to with these creds:
```
host: knotsql.cimbccxns4ka.us-east-2.rds.amazonaws.com
port: 3306
database: master
user: public
password: password
```
<br>
#### Or some of it, for free
Free as in even less effort than the free as in beer stuff up there.
The `zips_zones_sample` dataset included in this package contains a random sample of 1,000,000 rows of all the 3 digit origin-destination pairs. Load it with:
```{r}
data(zips_zones_sample)
```
It's what you'd get by running `fetch_all(show_details = TRUE)`, waiting a while, and then taking a sample.
```{r}
zips_zones_sample
```
The sample is about a quarter of the total number of rows between all origin prefixes and all destination prefixes, plus the 5 digit exceptions (~4m rows). See it put to use in the [vignette](https://github.com/aedobbyn/postal/blob/dev/vignettes/getting-zoned.Rmd).
<br>
That's it! [Bug reports](https://github.com/aedobbyn/postal/issues) and PRs welcome! `r emo::ji("mailbox_with_mail")`
<p align="center">
<img src="https://media.giphy.com/media/2fTYDdciZFEKZJgY7g/giphy.gif" alt="catch_mailman">
</p>