library(zctaCrosswalk)
This package is designed to help answer common analytical questions that arise when working with US ZIP Codes.
Note: the entity that maintains US ZIP Codes (the US Postal Service) does not release a map or crosswalk of that dataset. As a result, most analysts instead use ZIP Code Tabulation Areas (ZCTAs), which are maintained by the US Census Bureau. Census also provides Relationship Files that map ZCTAs to other geographies.
This package provides the Census Bureau’s “2020 ZCTA to County Relationship File” as a tibble, combines it with useful publicly available metadata (such as State names) and provides convenience functions for querying it.
The main functions in this package are:
?get_zctas_by_state
?get_zctas_by_county
?get_zcta_metadata
?get_zctas_by_state takes a vector of states and returns the vector of ZCTAs in those states. Here are some examples:
# Not case sensitive when using state names
head(
  get_zctas_by_state("California")
)
#> Using column state_name
#> [1] "89010" "89019" "89060" "89061" "89439" "90001"
# USPS state abbreviations are also OK - but these *are* case sensitive
head(
  get_zctas_by_state("CA")
)
#> Using column state_usps
#> [1] "89010" "89019" "89060" "89061" "89439" "90001"
# Multiple states at the same time are also OK
head(
  get_zctas_by_state(c("CA", "NY"))
)
#> Using column state_usps
#> [1] "06390" "10001" "10002" "10003" "10004" "10005"
# Throws an error - you can't mix types in a single request
# get_zctas_by_state(c("California", "NY"))
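If the inputs arrive in mixed form, one possible workaround is to query each type separately and merge the results. This is only a sketch: the variable names are illustrative, and it assumes the results are plain character vectors, as the output above suggests.

```r
library(zctaCrosswalk)

# Query each input type in its own call, since a single call cannot mix them
by_name <- get_zctas_by_state("California")  # full state name
by_usps <- get_zctas_by_state("NY")          # USPS abbreviation

# union() merges the two vectors and drops any ZCTA that appears in both
ca_ny <- union(by_name, by_usps)
head(ca_ny)
```

Using `union()` rather than `c()` is a defensive choice: it keeps the combined vector free of duplicates if a ZCTA is listed under more than one of the requested states.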
A common problem when doing analytics with states is ambiguity around names. For example, most people write “Washington, DC”, but this dataset uses “District of Columbia”. The most common solution to this problem is to use FIPS Codes when doing analytics with states, so ?get_zctas_by_state also supports FIPS codes.
Note that technically FIPS codes are characters and have a leading zero (e.g. California is “06”). But in practice people often use numbers (e.g. 6 for California) as well. As a result, ?get_zctas_by_state supports both:
ca1 = get_zctas_by_state("CA")
#> Using column state_usps
ca2 = get_zctas_by_state("06")
#> Using column state_fips
ca3 = get_zctas_by_state(6)
#> Using column state_fips_numeric
all(ca1 == ca2)
#> [1] TRUE
all(ca2 == ca3)
#> [1] TRUE
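If you need to move between the two forms yourself (for example, to join against another dataset that stores FIPS codes as text), base R is enough. The helper name `pad_state_fips` below is hypothetical and is not part of zctaCrosswalk; this is a minimal sketch of the conversion only.

```r
# Hypothetical helper (not part of zctaCrosswalk): convert numeric state
# FIPS codes to their two-character, zero-padded form.
pad_state_fips <- function(x) {
  sprintf("%02d", as.integer(x))
}

pad_state_fips(6)          # "06"
pad_state_fips(c(6, 36))   # "06" "36"

# Going the other way drops the leading zero
as.integer("06")           # 6
```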
?get_zctas_by_county works analogously to ?get_zctas_by_state. The primary difference is that it only accepts FIPS codes. This is because FIPS county codes are unique, but their names are not. (For example, 30 counties in this dataset are named “Washington County”!)
If you need to find the FIPS code for a particular county, I recommend simply googling it (e.g. “FIPS code for San Francisco County California”) or consulting this page.
Note that the FIPS codes can be either character or numeric.
# "06075" is San Francisco County, California
head(
  get_zctas_by_county("06075")
)
#> Using column county_fips
#> [1] "94102" "94103" "94104" "94105" "94107" "94108"
# 6075 (== as.numeric("06075")) works too
head(
  get_zctas_by_county(6075)
)
#> Using column county_fips_numeric
#> [1] "94102" "94103" "94104" "94105" "94107" "94108"
# Multiple counties at the same time are also OK
head(
  get_zctas_by_county(c("06075", "36059"))
)
#> Using column county_fips
#> [1] "11001" "11003" "11010" "11020" "11021" "11023"
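It can help to remember how a county FIPS code is built: the first two characters are the state FIPS code and the last three identify the county within that state. The base R sketch below pulls the two parts apart; the variable names are illustrative only.

```r
county_fips <- c("06075", "36059")

# First two characters: state FIPS code
state_part <- substr(county_fips, 1, 2)    # "06" "36"

# Last three characters: county code within the state
county_part <- substr(county_fips, 3, 5)   # "075" "059"

# Rebuilding the five-character form from a number needs zero padding
sprintf("%05d", 6075L)                     # "06075"
```

This is also why `"06075"` and `6075` refer to the same county: the numeric form simply loses the leading zero of the state part.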
?get_zcta_metadata takes a vector of ZCTAs and returns all available metadata on them. The ZCTAs can be either character or numeric.
get_zcta_metadata("90210")
#> # A tibble: 1 × 9
#> zcta zcta_numeric state_name state_usps state_fips state_fips_numeric
#> <chr> <int> <chr> <chr> <chr> <int>
#> 1 90210 90210 california CA 06 6
#> # ℹ 3 more variables: county_name <chr>, county_fips <chr>,
#> # county_fips_numeric <int>
# Some ZCTAs span multiple counties
get_zcta_metadata(39573)
#> # A tibble: 6 × 9
#> zcta zcta_numeric state_name state_usps state_fips state_fips_numeric
#> <chr> <int> <chr> <chr> <chr> <int>
#> 1 39573 39573 mississippi MS 28 28
#> 2 39573 39573 mississippi MS 28 28
#> 3 39573 39573 mississippi MS 28 28
#> 4 39573 39573 mississippi MS 28 28
#> 5 39573 39573 mississippi MS 28 28
#> 6 39573 39573 mississippi MS 28 28
#> # ℹ 3 more variables: county_name <chr>, county_fips <chr>,
#> # county_fips_numeric <int>
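The county columns are hidden in the printed output above because the tibble is too wide for the console. A minimal sketch for seeing which counties a ZCTA touches, assuming each row of the result corresponds to one ZCTA-county pairing, is to select just those columns:

```r
library(zctaCrosswalk)

md <- get_zcta_metadata("39573")

# Keep only the columns that differ from row to row
md[, c("zcta", "county_name", "county_fips")]

# Number of distinct counties this ZCTA overlaps
length(unique(md$county_fips))
```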