Adding new segregation indices is not much trouble. Please open an issue on GitHub to request that an index be added.
If you use the dplyr package, one pattern that works well is to use group_modify. Here, we compute the pairwise Black-White dissimilarity index for each state separately:
library("segregation")
library("dplyr")
schools00 %>%
  filter(race %in% c("black", "white")) %>%
  group_by(state) %>%
  group_modify(~dissimilarity(data = .x,
    group = "race",
    unit = "school",
    weight = "n"))
#> # A tibble: 3 × 3
#> # Groups: state [3]
#> state stat est
#> <fct> <chr> <dbl>
#> 1 A D 0.706
#> 2 B D 0.655
#> 3 C D 0.704
A similar pattern also works well with data.table:
library("data.table")
schools00 = as.data.table(schools00)
schools00[
  race %in% c("black", "white"),
  dissimilarity(data = .SD, group = "race", unit = "school", weight = "n"),
  by = .(state)]
#> state stat est
#> 1: A D 0.7063595
#> 2: B D 0.6548485
#> 3: C D 0.7042057
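To make the split-by-state pattern concrete, here is a minimal base R sketch of the same idea on toy data. It assumes the standard two-group formula D = 0.5 * sum over schools of |b_u / B - w_u / W|. The data frame `toy` and the helper `dissimilarity_base` are hypothetical names used only for illustration; this is not the package's implementation.

```r
# Toy long-format data (hypothetical): one row per state-school-race cell
toy <- data.frame(
  state  = rep(c("A", "B"), each = 4),
  school = rep(c("s1", "s1", "s2", "s2"), times = 2),
  race   = rep(c("black", "white"), times = 4),
  n      = c(90, 10, 10, 90,   # state A: strongly segregated
             50, 50, 50, 50)   # state B: evenly mixed
)

# Pairwise dissimilarity index: D = 0.5 * sum_u |b_u / B - w_u / W|
dissimilarity_base <- function(df) {
  counts <- xtabs(n ~ school + race, data = df)  # schools x races table
  shares <- prop.table(counts, margin = 2)       # each race's distribution over schools
  0.5 * sum(abs(shares[, "black"] - shares[, "white"]))
}

# Split by state, apply the helper, and collect one value per state
d_by_state <- sapply(split(toy, toy$state), dissimilarity_base)
print(d_by_state)
```

In this toy data, state A yields D = 0.8 and state B yields D = 0, because both groups in state B are distributed identically across schools.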
To compute many decompositions at once, it’s easiest to combine the data for the two time points. For instance, here’s a dplyr solution to decompose the state-specific M indices between 2000 and 2005:
# helper function for decomposition
diff = function(df, group) {
  data1 = filter(df, year == 2000)
  data2 = filter(df, year == 2005)
  mutual_difference(data1, data2, group = "race", unit = "school", weight = "n")
}

# add year indicators
schools00$year = 2000
schools05$year = 2005
combine = bind_rows(schools00, schools05)

combine %>%
  group_by(state) %>%
  group_modify(diff) %>%
  head(5)
#> # A tibble: 5 × 3
#> # Groups: state [1]
#> state stat est
#> <fct> <chr> <dbl>
#> 1 A M1 0.409
#> 2 A M2 0.445
#> 3 A diff 0.0359
#> 4 A additions -0.0159
#> 5 A removals 0.0390
Here is the equivalent data.table solution:
setDT(combine)
combine[, diff(.SD), by = .(state)] %>% head(5)
#> state stat est
#> 1: A M1 0.40859652
#> 2: A M2 0.44454379
#> 3: A diff 0.03594727
#> 4: A additions -0.01585879
#> 5: A removals 0.03903106
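The M1, M2, and diff rows in the output above can be illustrated with a small base R sketch. It assumes the usual definition of the mutual information index, M = sum over units and groups of p_ug * log(p_ug / (p_u * p_g)), and uses hypothetical toy data and a hypothetical helper `m_index`. It does not reproduce the further decomposition terms (such as additions and removals), which come from `mutual_difference` itself.

```r
# Mutual information index M for a long-format table with columns unit, group, n
m_index <- function(df) {
  counts <- xtabs(n ~ unit + group, data = df)
  p      <- counts / sum(counts)            # joint proportions p_ug
  p_exp  <- outer(rowSums(p), colSums(p))   # proportions expected under independence
  nz     <- p > 0                           # cells with p = 0 contribute nothing
  sum(p[nz] * log(p[nz] / p_exp[nz]))
}

# Hypothetical toy data for two time points
toy <- data.frame(
  year  = rep(c(2000, 2005), each = 4),
  unit  = rep(c("s1", "s1", "s2", "s2"), times = 2),
  group = rep(c("black", "white"), times = 4),
  n     = c(90, 10, 10, 90,   # 2000: segregated
            50, 50, 50, 50)   # 2005: evenly mixed
)

# One M value per year, then the simple difference between the two
m_by_year <- sapply(split(toy, toy$year), m_index)
m_diff <- m_by_year[["2005"]] - m_by_year[["2000"]]
print(m_by_year)
print(m_diff)
```

Here M falls to zero in 2005 because both groups have the same distribution over units, so the difference is negative.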
How can I use tidycensus to compute segregation indices?
Here are a few examples thanks to Kyle Walker, the author of the tidycensus package.
First, download the data:
library("tidycensus")
cook_data = get_acs(
  geography = "tract",
  variables = c(
    white = "B03002_003",
    black = "B03002_004",
    asian = "B03002_006",
    hispanic = "B03002_012"),
  state = "IL",
  county = "Cook")
#> Getting data from the 2015-2019 5-year ACS
Because this data is in “long” format, it’s easy to compute segregation indices:
# compute index of dissimilarity
cook_data %>%
  filter(variable %in% c("black", "white")) %>%
  dissimilarity(
    group = "variable",
    unit = "GEOID",
    weight = "estimate")
#> stat est
#> 1: D 0.7860354
# compute multigroup M/H indices
cook_data %>%
  mutual_total(
    group = "variable",
    unit = "GEOID",
    weight = "estimate")
#> stat est
#> 1: M 0.5201728
#> 2: H 0.4177068
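As a rough illustration of how the M and H rows relate, the following base R sketch computes both on hypothetical long-format toy data. It assumes that H is M divided by the entropy of the overall group distribution, which is consistent with the output above (0.5202 / 0.4177 is about 1.245, a plausible entropy for four groups). Treat it as a sketch under that assumption, not as the package's code.

```r
# Hypothetical long-format data: one row per unit-group cell, like cook_data
toy <- data.frame(
  unit  = rep(c("t1", "t2", "t3"), each = 2),
  group = rep(c("black", "white"), times = 3),
  n     = c(80, 20, 20, 80, 50, 50)
)

counts  <- xtabs(n ~ unit + group, data = toy)
p       <- counts / sum(counts)   # joint proportions p_ug
p_unit  <- rowSums(p)             # unit marginals p_u
p_group <- colSums(p)             # group marginals p_g

# M: mutual information between unit and group membership
p_exp <- outer(p_unit, p_group)
nz    <- p > 0
m     <- sum(p[nz] * log(p[nz] / p_exp[nz]))

# H: M normalized by the entropy of the group distribution (assumed definition)
group_entropy <- -sum(p_group * log(p_group))
h <- m / group_entropy

print(c(M = m, H = h))
```

Because the data are already in long format, a single cross-tabulation is enough to get everything the two indices need.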
Producing a map of local segregation scores is also not hard:
library("tigris")
library("ggplot2")
local_seg = mutual_local(cook_data,
  group = "variable",
  unit = "GEOID",
  weight = "estimate",
  wide = TRUE)

# download shapefile
seg_geom = tracts("IL", "Cook", cb = TRUE, progress_bar = FALSE) %>%
  left_join(local_seg, by = "GEOID")

ggplot(seg_geom, aes(fill = ls)) +
  geom_sf(color = NA) +
  coord_sf(crs = 3435) +
  scale_fill_viridis_c() +
  theme_void() +
  labs(title = "Local segregation scores for Cook County, IL",
    fill = NULL)
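To show what the mapped `ls` values represent, here is a base R sketch of local segregation scores on hypothetical toy data. It assumes the usual definition, ls_u = sum over groups of p(g|u) * log(p(g|u) / p_g), so that the size-weighted average of the local scores equals M. The object names are illustrative only.

```r
# Hypothetical long-format data: three tracts, two groups
toy <- data.frame(
  unit  = rep(c("t1", "t2", "t3"), each = 2),
  group = rep(c("black", "white"), times = 3),
  n     = c(80, 20, 20, 80, 50, 50)
)

counts  <- xtabs(n ~ unit + group, data = toy)
p_unit  <- rowSums(counts) / sum(counts)   # share of the population in each unit
p_group <- colSums(counts) / sum(counts)   # overall group composition

# Group composition within each unit, p(g | u)
p_cond <- prop.table(counts, margin = 1)

# Local score: divergence of each unit's composition from the overall composition
ratio <- sweep(p_cond, 2, p_group, "/")
terms <- p_cond * log(ratio)
terms[p_cond == 0] <- 0                    # empty cells contribute nothing
local_scores <- rowSums(terms)

# The population-weighted mean of the local scores recovers M
m <- sum(p_unit * local_scores)

print(local_scores)
print(m)
```

Tract t3 mirrors the overall composition exactly, so its local score is zero, while t1 and t2 deviate from it by the same amount and receive equal positive scores.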