Choropleth
1 Objectives
The objective for this section is to create an interactive choropleth map with leaflet.
2 Introduction
Prior to creating an interactive choropleth map in leaflet, a creator should ask this question: what’s the benefit to the reader in adding interactivity? The effort will result in larger file sizes, often with only marginal gain in user insight.
There are two primary benefits to interactivity. First, the map tiles can assist a user in understanding the geography of the data, particularly where the region is unfamiliar. If the audience is from the U.S., they have seen a map of the U.S. many times. However, if the geographies are census tracts of Michigan’s Upper Peninsula, then interactivity will aid the discussion. Few people may know that Marquette borders Lake Superior. Second, the user can hover over or click a polygon to see the data value. Rather than relying solely on the legend, the user can see the exact value for a geometry and, sometimes, that matters. If neither of those benefits is important, then a static map is the better choice.
There are two tutorials available for building choropleth maps in leaflet. The first is on the RStudio site and the second is on the “leaflet” site. Both were consulted prior to the creation of this section. This section departs from the tutorials in that it adds attributes like color to the underlying data frame rather than computing them inside the addPolygons function. The leaflet package contains some convenience functions like colorBin that speed development but add abstraction. Large leaflet maps are clunky, fragile, and difficult to debug when using these convenience functions. I prefer not to use them. This is a personal preference and not a requirement.
The example here is taken from the U.S. Census Bureau’s American Community Survey (ACS) data. The workflow resembles the common process of combining data with geometries, binning the data into groups, creating a color palette, and merging it with the underlying data frame. Then a leaflet map is made with a table in the popup. The data is the percentage of the population 65 years or older by state. Because Hawaii and Alaska are shifted from their true positions, the map tiles are not loaded; otherwise the two states would be superimposed on Mexico.
3 Packages Required
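The package list itself is not reproduced in this section. Judging only from the functions called in the steps below, a setup along the following lines should be sufficient; treat the exact list as an assumption rather than the original chunk.

```r
# Assumed package list, inferred from the functions used later in this section
library(dplyr)       # rename_with(), select(), mutate(), filter(), arrange()
library(tidyr)       # pivot_wider()
library(janitor)     # make_clean_names()
library(tidycensus)  # ACS data
library(tigris)      # state boundaries, shift_geometry()
library(sf)          # sf class for the geometries
library(ggplot2)     # cut_number()
library(colorspace)  # sequential_hcl()
library(leaflet)     # the map itself
library(leafpop)     # popupTable()
```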
4 Census Data
The first step is to retrieve the data from the ACS. The tidycensus package is used to get the data. tidycensus is “[a]n integrated R interface to several United States Census Bureau APIs and the US Census Bureau’s geographic boundary files. Allows R users to return Census and ACS data as tidyverse-ready data frames, and optionally returns a list-column with feature geometry for mapping and spatial analysis” (Walker & Herman, 2024). Prior to retrieving the data, you’ll need to register for an API key (Walker, 2023, Section 2.1). The data is the percentage of the population 65 years or older by state.
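The retrieval call is not shown here, so the following is only a sketch of what it might look like. The function name fetch_acs_2020, the choice of the 5-year survey, and the year are assumptions; the two variable codes are the ones used in the next step, and the result is stored as acs_2020 to match the code that follows.

```r
# Sketch: request two DP05 variables for every state from the 2020 ACS.
# fetch_acs_2020 is a hypothetical helper; survey = "acs5" is an assumption.
fetch_acs_2020 <- function() {
  tidycensus::get_acs(
    geography = "state",
    variables = c("DP05_0001", "DP05_0029"),
    year      = 2020,
    survey    = "acs5"
  )
}

# tidycensus reads the key from the CENSUS_API_KEY environment variable;
# tidycensus::census_api_key("<your key>", install = TRUE) stores it there.
# The request is only sent when a key is available.
if (nzchar(Sys.getenv("CENSUS_API_KEY"))) {
  acs_2020 <- fetch_acs_2020()
}
```

The result is a long data frame with one row per state and variable, holding the estimate and its margin of error.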
5 Convert to Ratio
The variables DP05_0001 and DP05_0029 are the total population and the population 65 years or older, respectively. The percentage of the population 65 years or older is calculated by dividing the population 65 years or older by the total population and multiplying by 100. Puerto Rico and the District of Columbia are removed from the data frame, leaving 50 observations.
# Clean the column names, drop the margin of error, and put one variable per column
acs_2020 %>%
  rename_with(~janitor::make_clean_names(.)) %>%
  select(-moe) %>%
  tidyr::pivot_wider(names_from = variable, values_from = estimate) %>%
  # Keep only the 50 states
  dplyr::filter(!name %in% c("Puerto Rico", "District of Columbia")) %>%
  # Share of the population aged 65 or older, in percent
  mutate(pct_65_or_older = (DP05_0029 / DP05_0001) * 100) %>%
  select(geoid, name, pct_65_or_older) %>%
  arrange(desc(pct_65_or_older)) -> acs_2020_pct_65_or_older
6 Geometries
The geometries are retrieved from the tigris package and form the states’ boundaries. The tigris package “allows users to directly download and use TIGER/Line and cartographic boundary shapefiles from the US Census Bureau in R” (Walker, 2024). The geometries are filtered to exclude the District of Columbia and Puerto Rico. The shift_geometry function is used to shift the geometries of Hawaii and Alaska to the lower left corner of the map, just below Texas.
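A sketch of this step, under stated assumptions: the helper name fetch_state_geometries, the use of the generalized cartographic boundary file (cb = TRUE), and the year are all guesses, not the original chunk. The result is stored as us_states to match the name used in the rest of this section.

```r
# Sketch: download state boundaries, drop DC and Puerto Rico, shift AK and HI.
# fetch_state_geometries is a hypothetical helper; cb and year are assumptions.
fetch_state_geometries <- function() {
  states_sf <- tigris::states(cb = TRUE, year = 2020)
  keep <- !states_sf$NAME %in% c("District of Columbia", "Puerto Rico")
  states_sf <- states_sf[keep, ]
  # Note: the island territories are still present at this point; they would
  # need to be filtered as well, or dropped later by the merge with the ACS data.
  tigris::shift_geometry(states_sf, position = "below")
}

# Guarded so that sourcing the file non-interactively does not start a download
if (interactive()) {
  us_states <- fetch_state_geometries()
}
```

Setting position = "outside" instead would place the shifted states beside the continental U.S. rather than below it.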
7 Merge
Remember that the order of the merge matters. “The most common type of attribute join on spatial data takes an sf object as the first argument and adds columns to it from a data.frame specified as the second argument” (Lovelace et al., 2019, Section 3.2.4). The “us_states” data frame comes first because that preserves the sf class for the eventual call to leaflet below.
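The effect of argument order can be checked on a toy example. The two objects below are made-up stand-ins, not the section’s data, and dplyr::left_join is used here as one possible way to do the attribute join.

```r
library(sf)
library(dplyr)

# Toy stand-ins (made-up values): two point geometries and a plain attribute table
geoms <- st_sf(
  geoid    = c("01", "02"),
  geometry = st_sfc(st_point(c(0, 0)), st_point(c(1, 1)))
)
attrs <- data.frame(geoid = c("01", "02"), pct_65_or_older = c(17.3, 12.5))

sf_first <- left_join(geoms, attrs, by = "geoid")  # sf object as first argument
df_first <- left_join(attrs, geoms, by = "geoid")  # data frame as first argument

inherits(sf_first, "sf")  # TRUE: the sf class survives
inherits(df_first, "sf")  # FALSE: the result is a plain data frame
```

Both results contain the same columns, but only the first can be handed to mapping functions that expect an sf object.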
8 Bin Variable
The ggplot2::cut_number function is used to bin the percentage of the population 65 years or older into five groups. The group variable is added to the data frame. An alternative that offers many more methods for binning data is the classInt package (Bivand, 2023).
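To illustrate the binning, here is a minimal sketch on made-up values. The vector below is synthetic and merely stands in for the 50 state percentages.

```r
library(ggplot2)

# Synthetic percentages standing in for the 50 state values
pct_65_or_older <- seq(12, 21.8, length.out = 50)

# cut_number() chooses break points so that each bin holds
# approximately the same number of observations
group <- cut_number(pct_65_or_older, n = 5)

levels(group)  # the five interval labels
table(group)   # number of observations per bin
```

Because the bins are based on quantiles, their widths differ when the data are skewed; the counts per bin stay roughly equal.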
9 Create Palette
A multi-hue palette is created with the colorspace package. A single-hue palette was explored, but the light colors were “washed out” when placed on the gray background.
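A sketch of how the palette might be attached to the data follows. The palette call matches the one in the legend of the map code; the toy data frame and the lookup by factor level are assumptions about how the color column is built.

```r
library(colorspace)

# Five colors from the multi-hue "PinkYl" palette, in reversed order
pal <- sequential_hcl(5, palette = "PinkYl", rev = TRUE)

# Hypothetical toy frame: three rows falling into bins 1, 5, and 3
toy <- data.frame(
  name  = c("A", "B", "C"),
  group = factor(c("g1", "g5", "g3"), levels = paste0("g", 1:5))
)

# Look up each row's color by the position of its bin among the factor levels
toy$color <- pal[as.integer(toy$group)]
toy
```

Storing the hex code in a column keeps the color assignment visible in the data frame, which is the debugging advantage described in the introduction.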
10 Map
Upon clicking a state polygon, the popup will display the state name and the percentage of the population 65 years or older. The popupTable function from the leafpop package creates the table. It can save a lot of the time otherwise spent writing customized HTML for the popup. The power of the leafpop package is only hinted at here: in addition to a table, it also allows the placement of an image or chart (Appelhans & Detsch, 2021). Note also that the leafletOptions function is used to set the coordinate reference system (CRS) to L.CRS.Simple and the minimum zoom level to -100.
library(leaflet)
library(leafpop)     # popupTable()
library(colorspace)  # sequential_hcl()

# L.CRS.Simple maps coordinates directly to x and y, with no geographic projection
leaflet(us_states,
        options = leafletOptions(
          crs = leafletCRS(crsClass = "L.CRS.Simple"),
          minZoom = -100)) %>%
  addPolygons(data = us_states,
              fillColor = ~color,
              fillOpacity = 0.7,
              color = "white",
              weight = 1,
              popup = popupTable(us_states, c("name", "pct_65_or_older"))
  ) %>%
  addLegend(
    position = "bottomright",
    colors = sequential_hcl(5, palette = "PinkYl", rev = TRUE),
    labels = levels(us_states$group.x),
    title = "Pct. 65 or older",
    opacity = 1
  )
11 Key Points
The tidycensus and tigris packages give convenient access to the data and shapefiles of the U.S. Census Bureau.
The shift_geometry function can place Hawaii and Alaska either below or outside the continental U.S.
leafpop allows for the inclusion of a table, image, or chart in the popup.