Return a consistent version of a US street address using stringr::str_*() functions. Letters are capitalized, punctuation is removed or replaced, and excess whitespace is trimmed and squished. Optionally, street suffixes can be replaced using the mapping supplied in abbs; in the examples below, usps_street turns "STREET" into "ST". Invalid values listed in na (possibly invalid_city) can be replaced with NA, as can strings made of a single repeated character ("XXXXXX").
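The capitalize, strip-punctuation, and squish steps described above can be pictured with a short base-R sketch. This is not the package's implementation: normalize_sketch() is a hypothetical helper, and keeping "#" and "/" is an assumption made here so that the result agrees with the first example in the Examples section.

```r
# Hypothetical helper for illustration; not part of the package.
normalize_sketch <- function(address, punct = "") {
  # Capitalize all letters.
  out <- toupper(address)
  # Replace punctuation with `punct`, keeping "#" and "/" (an assumption
  # made here so unit numbers and "C/O" survive).
  out <- gsub("[^[:alnum:][:space:]#/]", punct, out)
  # Trim both ends, then collapse internal runs of whitespace.
  out <- gsub("[[:space:]]+", " ", trimws(out))
  out
}

normalize_sketch("  12 east   2nd street,  #209 ")
#> [1] "12 EAST 2ND STREET #209"
```

The sketch leaves "STREET" untouched because it does no suffix replacement; that step is controlled by the abbs and abb_end arguments.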

normal_address(
  address,
  abbs = NULL,
  na = c("", "NA"),
  punct = "",
  na_rep = FALSE,
  abb_end = TRUE
)

Arguments

address

A vector of street addresses (ideally without city, state, or postal code).

abbs

A named vector or two-column data frame (like usps_street) passed to expand_abbrev(). See ?expand_abbrev for the type of object structure needed.

na

A character vector of values to make NA (like invalid_city).

punct

A character value with which to replace all punctuation.

na_rep

logical; if TRUE, replace all strings made of a single repeated character (like "XXXXXX") with NA.

abb_end

logical; Should only the last word of the string be abbreviated using the abbs argument? Passed to the end argument of str_normal().
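As a rough illustration of how the na, na_rep, and abb_end arguments might behave, the base-R sketch below mimics them with two hypothetical helpers, na_sketch() and replace_sketch(). Neither is part of the package, and the regular expressions are assumptions chosen for clarity rather than the package's own patterns.

```r
# Hypothetical helpers for illustration only; not the package's code.

# `na` / `na_rep`: set listed values to NA and, optionally, strings made
# of one repeated character such as "XXXXXX".
na_sketch <- function(x, na = c("", "NA"), na_rep = FALSE) {
  x[x %in% na] <- NA_character_
  if (na_rep) {
    # "^(.)\\1+$" matches one character followed only by repeats of it.
    x[grepl("^(.)\\1+$", x, perl = TRUE)] <- NA_character_
  }
  x
}

# `abb_end`: apply a replacement to the last word only, or to every word.
# `from` and `to` stand in for one row of an abbreviation table.
replace_sketch <- function(x, from, to, end = TRUE) {
  pattern <- if (end) paste0("\\b", from, "$") else paste0("\\b", from, "\\b")
  gsub(pattern, to, x, perl = TRUE)
}

na_sketch(c("123 MAIN ST", "XXXXXX", "NA", ""), na_rep = TRUE)
#> [1] "123 MAIN ST" NA            NA            NA

replace_sketch("12EAST 2ND STREET #209", "STREET", "ST", end = TRUE)
#> [1] "12EAST 2ND STREET #209"
replace_sketch("12EAST 2ND STREET #209", "STREET", "ST", end = FALSE)
#> [1] "12EAST 2ND ST #209"
```

In the last two calls the suffix is followed by a unit number, so a last-word-only replacement leaves it alone; replacing every matching word is what changes "STREET" to "ST" here.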

Value

A vector of normalized street addresses.

See also

Other geographic normalization functions: abbrev_full(), abbrev_state(), check_city(), expand_abbrev(), expand_state(), fetch_city(), normal_city(), normal_state(), normal_zip(), str_normal()

Examples

normal_address("P.O. #123, C/O John Smith", abbs = usps_street)
#> [1] "PO #123 C/O JOHN SMITH"
normal_address("12east 2nd street, #209", abbs = usps_street, abb_end = FALSE)
#> [1] "12EAST 2ND ST #209"