-
Notifications
You must be signed in to change notification settings - Fork 11
/
Copy pathfoodaccess.qmd
2980 lines (2578 loc) · 116 KB
/
foodaccess.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
---
title: Nutrition Security and Food Access
author: "Allison Bauman"
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(eval = FALSE, message = FALSE, warning = FALSE)
```
The Local and Regional Food Systems Data Warehouse contains data indicators related to [nutrition security and food access](https://localfoodeconomics.com/wp-content/uploads/2022/10/Nutrition-Security-and-Food-Access-Brief_final.pdf) that can help researchers, practitioners, and policymakers better understand which households and communities have reliable access to nutritious and affordable food, and where access might be limited.
Data include:
- Federal nutrition assistance program data, which document eligibility and participation in federal nutrition assistance programs. Data may be used to understand areas where program eligibility and participation is highest and where rates can be improved.
- Food purchase data, which document household food acquisition- including types of food purchased and sources of payment. Data may be used to understand how household income influences food shopping behavior.
- Food store data, which document access to grocery stores, types of food retail in communities, and store acceptance of WIC and SNAP. Data may be used to understand where areas of low supermarket access exist, how access to transportation impacts food purchasing, and where there are opportunities for increased food sources.
- Food preparation data, which document household access to kitchen equipment, time spent on food preparation, and time spent commuting and working outside the home. Data may be used to understand which households lack the time and equipment often needed to prepare nutritious meals.
- Food insecurity data, which document rates of food insecurity among households and special populations. Data may be used to understand how food insecurity rates vary by age, race, and geography.
- Population characteristic data, which document income level, disability status, households with children, older adults, and school enrollment, and other factors that impact eligibility for federal nutrition assistance programs.
Not all households have the same opportunities to make healthy food choices.
Black, Indigenous, and People of Color households are disproportionately affected by barriers to nutrition and food access which can result from community conditions whereby neighborhoods lack healthy food retail, public transit, and well-funded public school systems; working conditions whereby wages are low, hours are long, and places of employment lack healthy food options; generational wealth gaps whereby high rental housing costs and low savings rates make it difficult to weather financial emergencies; or discrimination by social service providers, whereby households are denied enrollment in income and food subsidy programs.
Site users are encouraged to acknowledge the systemic factors that influence household nutrition and food access, and to seek out opportunities to support policy and program changes that improve access for all households. When presenting data, we encourage disaggregation by individual race, ethnicity, and cultural group wherever possible. Aggregation of data can mask important differences that might be relevant for understanding needs and crafting adequate program and policy solutions. We also encourage the use of practices that invite community members to help contextualize data, share their personal stories, and amplify community solutions.
## State and county data
We use [tidycensus](https://walker-data.com/tidycensus/) to get state and county names by FIPS so they are uniform across all data sets.
In the tidycensus data, there is no data for FIPS 02010 Aleutian Islands Census Area, Alaska. This FIPS is found in the Census of Agriculture. We add this fips to our county data based on the [Geographic Area Codes and Titles from the U.S. Bureau of Labor Statistics](https://www.bls.gov/cew/classifications/areas/qcew-area-titles.htm).
```{r}
library(tidyverse, quietly = TRUE)
library(janitor, quietly = TRUE)
library(sf)
# Get county and state fips, state name, county name
county <- tidycensus::fips_codes %>%
unite("fips",
c(state_code, county_code),
sep = "", remove = FALSE) %>%
rename(county_name = county) %>%
select(fips, county_name, state_name)
# Add 02010 Aleutian Islands Census Area, Alaska
county <- county %>%
add_row(fips = "02010",
county_name = "Aleutian Islands Census Area",
state_name = "Alaska",
.after = 67)
# Get state data and add "00" US
state <- tidycensus::fips_codes %>%
select(state_code, state_name) %>%
rename(fips = state_code) %>%
distinct() %>%
add_row(fips = "00", state_name = "US")
# Merge so we have county and state data in one data frame
county_state <- bind_rows(county, state)
# Import county spatial data frame
county_sf <- tigris::counties(progress_bar = FALSE) %>%
clean_names()
```
## SNAP participation
We collect data on [U.S. Department of Agriculture, Food and Nutrition Service, Supplemental Nutrition Assistance Program (SNAP) participation rates, by state](https://www.fns.usda.gov/usamap). These data are not available at the county level from this source. We gather all available data from this source: fiscal year 2018, 2019, and 2020. National data shows on the map for "all eligible people" in fiscal years 2018, 2019 but is not part of the downloadable data; these are added manually.
Data is missing for Guam and Virgin Islands.
After selecting the data of interest, click on "More options" and select "Export to CSV" to download data. Data is exported from the website with the same file name, so years are added manually.
```{r}
# Import and bind data in one data frame, add file name to indicate year
file_list <- fs::dir_ls(path = "data_raw/foodaccess/FNS")
df <- file_list %>%
map(~read_csv(.,show_col_types = FALSE)) %>%
bind_rows(.id = "year") %>%
mutate(
year = str_remove(year, "data_raw/foodaccess/FNS/SNAP-state-rates_"),
year = str_remove(year, ".csv")) %>%
clean_names() %>%
rename(state_name = state)
# Manually add US data
df <- df %>%
add_row(
year = "2018",
state_name = "US",
all_eligible_people = 82) %>%
add_row(
year = "2019",
state_name = "US",
all_eligible_people = 82)
# Add state fips code, rename Virgin Island, U.S. Virgin Islands to match with all other data
df <- df %>%
mutate(
state_name = case_when(
state_name=="Virgin Islands" ~ "U.S. Virgin Islands",
TRUE ~ state_name)) %>%
left_join(state)
# Pivot longer
df <- df %>%
pivot_longer(
cols = !c(fips, year, state_name),
names_to = "variable_name",
values_to = "value")
# Add snap to beginning of variable name and percent to end and divide percent by 100
df <- df %>%
mutate(
variable_name = str_c("snap_", variable_name),
variable_name = str_c(variable_name, "_percent"),
value = value/100)
# Add variables
df <- df %>%
mutate(
county_name = NA,
category = "Food Access",
topic_area = "Federal Nutrition Program",
value_codes = NA) %>%
select(
fips, county_name, state_name,
category, topic_area,
year, variable_name, value, value_codes)
# Get list of years for meta data with a "|" between and add to metadata
years <- df %>%
distinct(year) %>%
pull(year) %>%
paste(collapse = "|") %>%
as_tibble() %>%
rename(years = value)
# Create metadata
meta <- df %>%
bind_cols(years) %>%
group_by(category, topic_area, variable_name) %>%
count() %>%
select(-n) %>%
mutate(
user_friendly_variable_name = case_when(
variable_name == "snap_all_eligible_people_percent" ~
"SNAP participation rates, all eligible people",
variable_name == "snap_elderly_people_percent" ~
"SNAP participation rates, elderly people",
variable_name == "snap_working_poor_people_percent" ~
"SNAP participation rates, working poor people"),
variable_definition = case_when(
variable_name == "snap_all_eligible_people_percent" ~
"Percent of eligible people participating in SNAP",
variable_name == "snap_elderly_people_percent" ~
"Percent of elderly people participating in SNAP",
variable_name == "snap_working_poor_people_percent" ~
"Percent of working poor people participating in SNAP"),
years = "2018|2019|2020",
periodicity = "yearly",
aggregation = "percent",
format = "percent",
keywords = "USDA|FNS|Food and Nutrition Services|Food stamps",
hashtags = "#foodstamps|#getSNAP|#snapmatters|#ebt|#SNAPbenefits",
chart_type1 = "BarChart",
chart_type2 = "LineChartSeries",
chart_axis_x1 = "SNAP participation rate",
chart_axis_x2 = NA,
chart_axis_y1 = NA,
chart_axis_y2 = chart_axis_x1,
source = "U.S. Department of Agriculture, Food and Nutrition Service, Supplemental Nutrition Assistance Program (SNAP)",
url = "https://www.fns.usda.gov/usamap",
citation = "U.S. Department of Agriculture, Food and Nutrition Service, Supplemental Nutrition Assistance Program (SNAP) participation rates, by state")
# Define FNS metadata
meta_fns <- meta
df_fns <- df
rm(df, meta)
```
## SNAP participation by race
We collect county-level data on SNAP participation by race from the [U.S. Census Bureau, American Community Survey, S2201 Food Stamps/Supplemental Nutrition Assistance Program (SNAP)](https://data.census.gov/cedsci/table?q=food%20assistance&tid=ACSST5Y2020.S2201). We gather 5-year estimates from 2012-2022. (Data are available starting in 2010, if you want data from prior years). We select Geography, County, All Counties within the United States and Puerto Rico, and download table data. Data from these tables are available at the US, state, and county levels.
An ''-'' entry in the estimate column indicates that either no sample observations or too few sample observations were available to compute an estimate, or a ratio of medians cannot be calculated because one or both of the median estimates falls in the lowest interval or upper interval of an open-ended distribution. We code this as NA and indicate an "D" in the values_column for not disclosed.
Note, the variable names from 2012-2014 are different than from 2015-2022. Variables include all available data on households receiving SNAP/food stamps by race.
```{r}
# Import metadata for variables of interest
data_path <- list.dirs(path = "data_raw/foodaccess/ACS_snap")
files <- list.files(path = data_path,
pattern = "Metadata",
full.names = TRUE)
meta <- map(files,
~read_csv(.x,
show_col_types = FALSE)) %>%
set_names(basename(files)) %>%
enframe(name = "year", value = "data") %>%
unnest(cols = data) %>%
mutate(year = str_remove(year, "ACSST5Y"),
year = str_remove(year, ".S2201-Column-Metadata.csv")) %>%
filter(str_detect(Label,
"RACE AND HISPANIC OR LATINO ORIGIN OF HOUSEHOLDER|Hispanic or Latino origin")) %>% clean_names()
# Split by years where variable names change
meta1214 <- meta %>%
filter(year %in% c("2012", "2013", "2014")) %>%
filter(str_detect(label, "Household") &
!str_detect(label, "not") &
!str_detect(label, "Margin"))
# Define variable name and user friendly variable name
# Drop race category "one race", and change 2012-2014 to say race alone (e.g., black alone) to match future years
meta1214 <- meta1214 %>%
filter(
label != "Households receiving food stamps!!Estimate!!RACE AND HISPANIC OR LATINO ORIGIN OF HOUSEHOLDER!!One race") %>%
mutate(
label = str_remove(label,
"Households receiving food stamps!!Estimate!!RACE AND HISPANIC OR LATINO ORIGIN OF HOUSEHOLDER!!"),
label = str_remove(label,
"Households receiving food stamps!!Estimate!!"),
label = case_when(
str_detect(label, "One race!!") ~ str_c(label, " alone"),
TRUE ~ label),
label = str_remove(label,
"One race!!"),
label = str_remove(label,
"Households receiving food stamps/SNAP!!Estimate!!RACE AND HISPANIC OR LATINO ORIGIN OF HOUSEHOLDER!!"))
# Define metadata for years 2015-2016
meta1516 <- meta %>%
filter(year %in% c("2015", "2016")) %>%
filter(str_detect(label, "Percent households receiving food stamps/SNAP!!Estimate!!")) %>%
mutate(
label = str_remove(label,
"Percent households receiving food stamps/SNAP!!Estimate!!RACE AND HISPANIC OR LATINO ORIGIN OF HOUSEHOLDER!!"),
label = str_remove(label,
"Percent households receiving food stamps/SNAP!!Estimate!!"))
# Meta for 2017-2022
meta1722 <- meta %>%
filter(year %in% c("2017", "2018", "2019",
"2020", "2021", "2022")) %>%
filter(str_detect(label,
"Estimate!!Percent households receiving food stamps/SNAP!!Households!!")) %>%
mutate(label = str_remove(label,
"Estimate!!Percent households receiving food stamps/SNAP!!Households!!RACE AND HISPANIC OR LATINO ORIGIN OF HOUSEHOLDER!!"))
# Define all metadata
meta <- bind_rows(meta1214, meta1516, meta1722)
rm(meta1214, meta1516, meta1722)
# Add variables
meta <- meta %>%
mutate(
label = case_when(
str_detect(label, "Two") |
str_detect(label, "Some") ~ tolower(label),
TRUE ~ label),
user_friendly_variable_name =
str_c("SNAP participation rate, ", label),
variable_name =
str_remove(label, " alone"),
variable_name =
str_remove(variable_name, " origin \\(of any race\\)"),
variable_name =
tolower(str_replace_all(variable_name, ",", "")),
variable_name =
str_replace_all(variable_name, " ", "_"),
variable_name =
str_c("snap_participation_rate_", variable_name)) %>%
rename(variable_code = column_name)
# Keep metadata for years, variable codes and names to determine vars to keep in df
metadf <- meta %>%
select(year, variable_code, variable_name)
# Get list of years for meta data with a "|" between and add to metadata
years <- meta %>%
distinct(year) %>%
pull(year) %>%
paste(collapse = "|") %>%
as_tibble() %>%
rename(years = value)
# keep only distinct entries and add years
meta <- meta %>%
select(-year) %>%
distinct() %>%
bind_cols(years)
# Define metadata for acs snap data and add columns
meta <- meta %>%
mutate(
category = "Food Access",
topic_area = "Federal Nutrition Program",
variable_definition =
str_c("Percent of eligible ", label, " households participating in SNAP"),
years = "2012|2013|2014|2015|2016|2017|2018|2019|2020|2021|2022",
periodicity = "yearly",
aggregation = "percent",
format = "percent",
keywords = "American Community Survey|ACS|Census Bureau|SNAP",
hashtags = "",
chart_type1 = "BarChart",
chart_type2 = "LineChartSeries",
chart_axis_x1 = "SNAP participation rate",
chart_axis_x2 = NA,
chart_axis_y1 = NA,
chart_axis_y2 = chart_axis_x1,
source = "U.S.Census Bureau, American Community Survey",
url =
"https://data.census.gov/cedsci/table?q=food%20assistance&tid=ACSST5Y2020.S2201",
citation = "U.S. Census Bureau, American Community Survey, S2201 Food Stamps/Supplemental Nutrition Assistance Program (SNAP)")
# Import data
data_path <- list.dirs(path = "data_raw/foodaccess/ACS_snap")
files <- list.files(path = data_path,
pattern = "Data",
full.names = TRUE)
df <- map(files,
~read_csv(.x,
show_col_types = FALSE)) %>%
set_names(basename(files)) %>%
enframe(name = "year", value = "data") %>%
unnest(cols = data) %>%
mutate(year = str_remove(year, "ACSST5Y"),
year = str_remove(year, ".S2201-Data.csv"))
# Get fips from GEO_ID and keep variable of interest
df <- df %>% mutate(
fips = case_when(
str_detect(GEO_ID, "0400000") ~ str_sub(GEO_ID,-2),
str_detect(GEO_ID, "0100000") ~ "00",
str_detect(GEO_ID, "0500000") ~ str_sub(GEO_ID, -5)))
# For each year, there is a GEO_ID "Geography" without values,
# we drop these rows
df <- df %>% filter(
GEO_ID != "Geography") %>%
select(-GEO_ID)
# Pivot longer
df <- df %>% pivot_longer(
cols = !c(fips, year),
names_to = "variable_code",
values_to = "value")
# keep variables of interest
df <- metadf %>%
left_join(df, by = c("year", "variable_code"))
# Convert "-" to NA and add value_codes for data not disclosed ("D")
df <- df %>%
mutate(
value_codes = case_when(
value == "-" ~ "D",
TRUE ~ NA),
value = case_when(
value == "-" ~ NA,
TRUE ~ as.numeric(value)),
value = value/100)
# Add state and county names
df <- df %>% left_join(county_state)
# Add variables
df <- df %>%
mutate(
topic_area = "Federal Nutrition Program",
category = "Food Access") %>%
select(
fips, county_name, state_name, category,
topic_area, year, variable_name, value, value_codes)
# define dataframes
df_acs_snap <- df
meta_acs_snap <- meta
rm(df, meta)
```
## WIC Eligibility and participation rates, state-level
State-level WIC eligibility and participation data is gathered from the [U.S. Department of Agriculture, Food and Nutrition Service, National and State-level estimates of WIC Eligibility and Program Reach in 2021](https://www.fns.usda.gov/research/wic/eligibility-and-program-reach-estimates-2021). We gather all available data, 2016-2021. Data is downloaded from the WIC Eligibility and Participation by State Over Time, 2016--2021 chart by clicking on the download data button at the top of the figure.
Data for the total population, eligible population, participant population, coverage rates, and eligibility rates are each includes on a different sheet. We import population data and compute coverage and eligibility rates.
```{r}
library(readxl)
# Define data path
file_path <- "data_raw/foodaccess/Figure6-trends-in-wic-eligibility-and-participation-by-state.xlsx"
sheet_names = c("Total Population",
"Eligible Population",
"Participant Population")
# Read each sheet of data into a list, pivot longer and add a variable to indicate sheet
df_list <- sheet_names %>%
map(function(sheet){
read_xlsx(path = file_path,
sheet = sheet,
range = "A3:G55") %>%
pivot_longer(
cols = !State,
names_to = "year",
values_to = "value") %>%
mutate(
variable = sheet)
})
# Bind all data from list into one data frame
df <- bind_rows(df_list) %>%
pivot_wider(names_from = variable,
values_from = value) %>%
clean_names()
# Add a national total row
df_us <- df %>%
group_by(year) %>%
summarise(across(total_population:participant_population,
~sum(.))) %>%
mutate(state = "US")
df <- bind_rows(df, df_us)
rm(df_us)
# Calculate coverage and eligibility rates and drop original variables
df <- df %>%
mutate(
wic_eligibility_rate = eligible_population/total_population,
wic_coverage_rate = participant_population/eligible_population) %>%
select(state, year, ends_with("_rate"))
# Add fips
df <- df %>%
rename(state_name = state) %>%
left_join(state)
# Pivot longer
df <- df %>%
pivot_longer(
cols = !c(fips, state_name, year),
values_to = "value",
names_to = "variable_name")
# Add variables
df <- df %>%
mutate(
county_name = NA,
topic_area = "Federal Nutrition Program",
category = "Food Access") %>%
select(
fips, county_name, state_name,
category, topic_area, year, variable_name, value)
# Get list of years for meta data with a "|" between and add to metadata
years <- df %>%
distinct(year) %>%
pull(year) %>%
paste(collapse = "|") %>%
as_tibble() %>%
rename(years = value)
# Add metadata
meta <- df %>%
bind_cols(years) %>%
group_by(category, topic_area, variable_name) %>%
count() %>% select(!n) %>%
mutate(
user_friendly_variable_name = case_when(
str_detect(variable_name, "eligibility") ~ "WIC eligibility rate",
str_detect(variable_name, "coverage") ~ "WIC coverage rate"),
variable_definition = case_when(
variable_name == "wic_eligibility_rate" ~ "Percent of population who were eligible to participate in WIC",
variable_name == "wic_coverage_rate" ~ "Percent of eligible people who participated in WIC"),
years = "2016|2017|2018|2019|2020|2021",
periodicity = "yearly",
aggregation = "percent",
format = "percent",
keywords = "USDA|FNS|Food and Nutrition Services|Food stamps|WIC|Women",
hashtags = "#WICBreastfeeding|#WICspeaksBF",
chart_type1 = "BarChart",
chart_type2 = "LineChartSeries",
chart_axis_x1 = case_when(
variable_name == "wic_eligibility_rate" ~ "WIC eligibility rate",
variable_name == "wic_coverage_rate" ~ "WIC coverage rate"),
chart_axis_x2 = NA,
chart_axis_y1 = NA,
chart_axis_y2 = chart_axis_x1,
source = "U.S. Department of Agriculture, Food and Nutrition Service",
url = "https://www.fns.usda.gov/research/wic/eligibility-and-program-reach-estimates-2021",
citation = "U.S. Department of Agriculture, Food and Nutrition Service, National and State Level Estimates of WIC Eligibility and Program Reach in 2021")
# define file
meta_wic <- meta
df_wic <- df
```
## Food Environment data
Data on the food environment is collected from the [U.S. Department of Agriculture, Economic Research Service, Food Environment Atlas, Food Environment Atlas](https://www.ers.usda.gov/data-products/food-environment-atlas/data-access-and-documentation-downloads). We download the data set called "Food Environment Atlas .csv Files last updated 9/10/2020."
We gather data on food access by race and food stores. We then create an index out of these variables to represent retail density (not including SNAP-authorized stores). This index will be based on the highest number of food stores/1,000 pop (grocery, supercenter, convenience, specialized) in a county. A retail density index of 1 will represent the county in the U.S. with the highest retail density.
We also use data from SupplementalDataState to compute state and national totals for the data that we can.
Data that were not available, not applicable, or suppressed for specific counties are denoted with a blank cell. As we do not have information on suppression, we assume all blank cells are NA.
```{r}
# read in meta data
file_path <- "data_raw/foodaccess/FoodEnvironmentAtlas/VariableList.csv"
meta <- read_csv(file_path,
show_col_types = FALSE) %>%
clean_names()
# Keep variables of interest and change variable name and units
meta <- meta %>%
filter(
(category_code == "ACCESS" & units != "% change") |
(category_code == "STORES" & units != "% change")) %>%
mutate(
year = str_remove(variable_name, ".*,"),
variable_name =
str_replace(variable_name,
",[^,]*$", ""),
variable_name =
str_replace(variable_name,
"\\(%\\)", "percent"),
units = case_when(
units == "# per 1,000 pop" ~ "number per 1,000 pop",
TRUE ~ tolower(units)))
# Separate food store and low access data for easier creation of user_friendly_variable_name, we want access to be first and demographics second
race <- c("White", "Black", "Hispanic ethnicity",
"Asian", "American Indian or Alaska Native",
"Hawaiian or Pacific Islander")
meta1 <- meta %>%
filter(category_code == "ACCESS") %>%
mutate(
var1 = str_extract(variable_name, "^[^,]*"),
var1 = case_when(
var1 %in% race ~ var1,
TRUE ~ tolower(var1)),
var2 = str_remove(variable_name, "^[^,]*"),
var2 = str_to_sentence(str_remove(var2, ", ")),
variable_name = case_when(
var2== "" ~ str_to_sentence(var1),
TRUE ~ str_c(var2, var1, sep = ", ")),
variable_name = str_replace(variable_name,
"store percent",
"store, percent"))
meta2 <- meta %>%
filter(category_code != "ACCESS")
meta <- bind_rows(meta1, meta2)
rm(meta1, meta2)
# Add variable definition
meta <- meta %>%
mutate(
variable_name = str_replace(variable_name,
"&", "and"),
def1 = case_when(
units %in% c("count", "number per 1,000 pop") ~ "Number ",
units == "percent" ~ "Percent "),
def2 = case_when(
subcategory_name == "Demographics" ~
"of individuals in a county who are ",
subcategory_name == "Overall" ~
"of people",
str_detect(var1, "low income & low access to store") ~
"of people in a county with low income and living more than 1 mile from a supermarket or large grocery store if in an urban area, or more than 10 miles from a supermarket or large grocery store if in a rural area",
str_detect(var1, "households") ~
"of housing units in a county without a car and more than 1 mile from a supermarket or large grocery store",
TRUE ~ NA),
def3 = case_when(
subcategory_name %in% c("Demographics", "Overall", "Household Resources") ~
" in a county living more than 1 mile from a supermarket or large grocery store if in an urban area, or more than 10 miles from a supermarket or large grocery store if in a rural area",
TRUE ~ NA),
variable_definition = case_when(
subcategory_name == "Demographics" ~
str_c(def1, def2, var1, def3),
subcategory_name == "Overall" ~
str_c(def1, def2, def3),
str_detect(var1, "low income & low access to store") |
str_detect(var1, "households") ~
str_c(def1, def2),
category_name == "Store Availability" ~
str_c("Number of ", tolower(variable_name), " in a county")))
# Define axis titles
meta <- meta %>%
mutate(
axis = str_replace_all(def2, "\\bin\\b.*", ""),
axis = case_when(
str_detect(axis, "of housing units") ~ "of housing units",
TRUE ~ axis),
axis = case_when(
is.na(var1) ~ str_remove(variable_definition, " in a county"),
TRUE ~ str_c(def1, axis, sep = "")))
# Add columns
meta <- meta %>%
mutate(
`2 pager title` = "Nutrition Security and Food Access",
user_friendly_variable_name = variable_name,
category = case_when(
category_code == "ACCESS" ~ "Food Access",
category_code == "STORES" ~ "Food Retail"),
topic_area = "Food Store",
variable_name = tolower(variable_code),
years = case_when(
category_code == "ACCESS" ~ "2010|2015",
category_code == "STORES" &
subcategory_name != "SNAP-authorized" ~ "2011|2016",
category_code == "STORES" &
subcategory_name == "SNAP-authorized" ~ "2012|2017"),
periodicity = "every five years",
aggregation = units,
format = case_when(
units == "percent" ~ "percent",
TRUE ~ "integer"),
keywords = "USDA|Food Environment Atlas|Household Resources|Access and Proximity to Grocery Store|Store Availability",
hashtags = "#foodaccess|#foodsecurity|#foodjustice",
chart_type1 = "BarChart",
chart_type2 = "ColumnChart",
chart_axis_x1 = axis,
chart_axis_x2 = NA,
chart_axis_y1 = NA,
chart_axis_y2 = chart_axis_x1,
source = "U.S. Department of Agriculture, Economic Research Service, Food Environment Atlas, Food Environment Atlas",
url = "https://www.ers.usda.gov/data-products/food-environment-atlas/data-access-and-documentation-downloads",
citation = "U.S. Department of Agriculture, Economic Research Service, Food Environment Atlas, Food Environment Atlas",
last_update_date = "12/30/23") %>%
select(`2 pager title`, category, topic_area, variable_name,
user_friendly_variable_name, variable_definition,
years, periodicity, aggregation, format,
keywords, hashtags,
chart_type1, chart_type2,
chart_axis_x1, chart_axis_x2, chart_axis_y1,
chart_axis_y2, source, url, citation, last_update_date)
# read in data
file_path <- "data_raw/foodaccess/FoodEnvironmentAtlas/StateAndCountyData.csv"
df <- read_csv(file_path,
col_types = cols(FIPS = "c"),
show_col_types = FALSE) %>%
clean_names()
# Add a leading zero into fips and create variable name to match meta data
df <- df %>%
mutate(
variable_name = tolower(variable_code),
fips = str_pad(fips, side = "left", pad = "0", width = "5")) %>%
select(fips, variable_name, value)
# Merge with metadata to keep only data of interest
var_list <- meta %>%
select(category, topic_area, variable_name)
df <- left_join(var_list, df)
# Remove year from variable name and create a year column
df <- df %>%
mutate(year = str_sub(variable_name, -2),
year = str_c("20", year),
variable_name = str_sub(variable_name, 1, -3))
# Divide all percentages by 100
df <- df %>%
mutate(
value = case_when(
str_detect(variable_name, "pct") ~ value/100,
TRUE ~ value))
# Join with county data
df <- df %>% left_join(county)
# Add state-level population data to calculate state/US-level percentages
file_path <- "data_raw/foodaccess/FoodEnvironmentAtlas/SupplementalDataState.csv"
df_state_pop <- read_csv(file_path,
show_col_types = FALSE) %>%
clean_names() %>%
filter(str_detect(variable_code, "State_Population"))
# Add year variable, pad fips, add state names
df_state_pop <- df_state_pop %>%
mutate(
year = str_sub(variable_code, -2),
year = str_c("20", year),
fips = str_pad(state_fips, side = "left", pad = "0", width = "2")) %>%
rename(population = value) %>%
select(fips, year, population) %>%
left_join(state)
# Add US population by year
df_us_pop <- df_state_pop %>%
group_by(year) %>%
summarise(population = sum(population)) %>%
mutate(
fips = "00",
state_name = "US")
# Bind rows
pop <- bind_rows(df_state_pop, df_us_pop) %>%
select(-state_name)
rm(df_state_pop, df_us_pop)
# Add county-level data and aggregate to state, drop all percent variables
df_state <- df %>%
filter(!str_detect(variable_name, "pct|pth")) %>%
group_by(category, topic_area, year, state_name, variable_name) %>%
summarise(value = sum(value)) %>%
left_join(state)
# Add county-level data and aggregate to US, drop all percent variables
df_us <- df %>%
filter(!str_detect(variable_name, "pct|pth")) %>%
group_by(category, topic_area, year, variable_name) %>%
summarise(value = sum(value)) %>%
mutate(fips = "00")
# add state and US data and join population data
df_state_us <- bind_rows(df_state, df_us)
df_state_us <- left_join(pop, df_state_us,
by = join_by("fips", "year")) %>%
filter(!is.na(value))
rm(df_state, df_us, pop)
# Add percent and per 1,000 pop at the state and U.S. level
df_state_us_pct <- df_state_us %>%
mutate(
variable_name = case_when(
str_detect(variable_name, "laccess") ~ str_c("pct_", variable_name),
TRUE ~ str_c(variable_name, "pth")),
value = case_when(
str_detect(variable_name, "laccess") ~ value/population,
TRUE ~ value/(population/1000))) %>%
select(-population)
df_state_us <- df_state_us %>%
select(-population) %>%
bind_rows(df_state_us_pct)
# Bind county, state and national data
df <- bind_rows(df, df_state_us)
rm(df_state_us, df_state_us_pct)
# Add columns and re-order for final format
df <- df %>%
mutate(
value_codes = NA) %>%
select(
fips, county_name, state_name, category,
topic_area, year, variable_name, value, value_codes)
# Redefine meta variable names
meta <- meta %>%
mutate(
variable_name = str_sub(variable_name, 1, -3))
# Define data frame
df_FEatlas <- df
meta_FEatlas <- meta
rm(df, meta)
```
## Hours worked
We gather county-level data on mean usual hours worked from the [U.S. Census Bureau, American Community Survey, B23020 mean usual hours worked in the past 12 months for workers 16-64 years, 5-year estimate](https://data.census.gov/table/ACSDT5Y2020.B23020?q=mean%20hours%20worked).
We download data from 2012-2022. (Data are available starting in 2010, if you want data from prior years). We select Geography, State, All States within United States, Puerto Rico, and the Island Areas. Data from this selection are provided at the U.S., state and county levels.
```{r}
# Import meta data
data_path <- list.dirs(path = "data_raw/foodaccess/ACS_hours")
files <- list.files(path = data_path,
pattern = "Metadata",
full.names = TRUE)
meta <- map(files,
~read_csv(.x,
show_col_types = FALSE)) %>%
set_names(basename(files)) %>%
enframe(name = "year", value = "data") %>%
unnest(cols = data) %>%
mutate(year = str_remove(year, "ACSDT5Y"),
year = str_remove(year, ".B23020-Column-Metadata.csv")) %>%
filter(str_detect(Label, "Estimate")) %>%
rename(variable_code = `Column Name`,
variable_description = Label)
# Keep only necessary elements of variable description
meta <- meta %>%
mutate(
gender = tolower(str_extract(variable_description,
"Male|Female")),
gender = case_when(
is.na(gender) ~ "total",
TRUE ~ gender),
user_friendly_variable_name = str_c("Hours worked per week, average for workers 16-64, in the past 12 months, ", gender))
# Define variable names and definition
meta <- meta %>%
mutate(
variable_name = str_c("average_usual_hours_", gender),
variable_definition =
str_c("Average usual hours worked in a week during the past 12 months for workers 16 to 64 years, ", gender))
# Get metadata to use to define elements to keep in data
metadf <- meta %>%
select(year, variable_code, variable_name)
# Get list of years for meta data with a "|" between and add to metadata
years <- meta %>%
distinct(year) %>%
pull(year) %>%
paste(collapse = "|") %>%
as_tibble() %>%
rename(years = value)
# keep only distinct entries and add years
meta <- meta %>%
select(-year) %>%
distinct() %>%
bind_cols(years)
# Define metadata add additional variables
meta <- meta %>%
mutate(
category = "Labor",
topic_area = "Food Preparation",
periodicity = "yearly",
aggregation = "mean",
format = "integer",
keywords = "American Community Survey|ACS|Census Bureau",
hashtags = "#labormarket|#gender",
chart_type1 = "BarChart",
chart_type2 = "LineChartSeries",
chart_axis_x1 = str_c("Average usual hours, ", gender),
chart_axis_x2 = NA,
chart_axis_y1 = NA,
chart_axis_y2 = chart_axis_x2,
source = "U.S.Census Bureau, American Community Survey",
url = "https://data.census.gov/table/ACSDT5Y2020.B23020?q=mean%20hours%20worked",
citation = "U.S. Census Bureau, American Community Survey, B23020 mean usual hours worked in the past 12 months for workers 16-64 years, 5-year estimate, county")
# Import data
data_path <- list.dirs(path = "data_raw/foodaccess/ACS_hours")
files <- list.files(path = data_path,
pattern = "Data",
full.names = TRUE)
df <- map(files,
~read_csv(.x,
show_col_types = FALSE)) %>%
set_names(basename(files)) %>%
enframe(name = "year", value = "data") %>%
unnest(cols = data) %>%
mutate(year = str_remove(year, "ACSDT5Y"),
year = str_remove(year, ".B23020-Data.csv")) %>%
select(year, GEO_ID, ends_with("E"))
# Get fips from GEO_ID and keep variable of interest
df <- df %>% mutate(
fips = case_when(
str_detect(GEO_ID, "0400000") ~ str_sub(GEO_ID,-2),
str_detect(GEO_ID, "0100000") ~ "00",
str_detect(GEO_ID, "0500000") ~ str_sub(GEO_ID, -5)))
# For each year, there is a GEO_ID "Geography" without values,
# we drop these rows
df <- df %>% filter(
GEO_ID != "Geography") %>%
select(-GEO_ID, -NAME)
# Pivot longer
df <- df %>% pivot_longer(
cols = !c(fips, year),
names_to = "variable_code",
values_to = "value")
# Convert "N", null", "-", "+" to NA and add value_codes for data not disclosed ("D")
df <- df %>%
mutate(
value_codes = case_when(
value == "N" ~ "D",
TRUE ~ NA),
value = str_remove_all(value, ","),
value = str_remove_all(value, "\\-"),
value = str_remove_all(value, "\\+"),
value = case_when(
value == "null" | value=="-" ~ NA,
TRUE ~ value),
value = as.numeric(value))
# Join data and meta data
df <- metadf %>%
left_join(df, by = c("year", "variable_code"))
# Divide women's income as a percent of men's by 100 so in percent format
df <- df %>%
mutate(
value = case_when(
str_detect(variable_name, "percentage") ~ value/100,
TRUE ~ value))
# Add in variables and put in correct order
df <- df %>%
left_join(county_state) %>%
mutate(
category = "Labor",
topic_area = "Food Preparation") %>%
select(
fips, county_name, state_name, category,
topic_area, year, variable_name, value, value_codes)
# Define dataframe and metadata
df_meanhours <- df
meta_meanhours <- meta
```
## Travel time to work
We gather data from the [U.S. Census Bureau, American Community Survey, B08303 Travel Time to Work, 5-year estimate, county](https://data.census.gov/cedsci/table?q=b08303&tid=ACSDT5Y2020.B08303). We download data from 2012-2022. (Data are available starting in 2010, if you want data from prior years). We select Geography, County, All Counties within United States and Puerto Rico. These data include both state and county-level data. We aggregate state data to get national totals.
From these data, we calculate percentage of workers within each category of travel time to work.
We convert "null" to NA and convert to numeric, there is no information on if this means anything beyond not available.
```{r}
# Import metadata for variables of interest
data_path <- list.dirs(path = "data_raw/foodaccess/ACS_traveltime")
files <- list.files(path = data_path,
pattern = "Metadata",
full.names = TRUE)
meta <- map(files,
~read_csv(.x,
show_col_types = FALSE)) %>%
set_names(basename(files)) %>%
enframe(name = "year", value = "data") %>%
unnest(cols = data) %>%
mutate(year = str_remove(year, "ACSDT5Y"),
year = str_remove(year, ".B08303-Column-Metadata.csv")) %>%
filter(!str_detect(`Column Name`, "M$")) %>%
rename(variable_code = `Column Name`,
variable_description = Label)
# Keep only necessary elements of variable description
meta <- meta %>%
mutate(
variable_description = str_remove(variable_description,
"Estimate!!Total!!"),
variable_description = str_remove(variable_description,
"Estimate!!"),
variable_description = str_remove(variable_description,
":"),
variable_description = tolower(str_remove(variable_description,
"Total!!"))) %>%
filter(!(variable_description %in% c("geography",
"geographic area name")))
# Define variable names and definitions
meta <- meta %>%
mutate(
variable_name =
str_replace_all(variable_description,
", ", "_"),
variable_name = str_replace_all(variable_name,
" ", "_"),
variable_name = str_c("travel_time_to_work_", variable_description),
user_friendly_variable_name = str_c("Travel time to work, ",
variable_description),
variable_definition = case_when(
variable_description == "total" ~
"Number of workers, 16 years of age and over, who did not work from home",
TRUE ~ str_c("Number of workers, 16 years of age and over, with travel time to work ", variable_description)))
# Add meta data for each category as a percent of total
meta_pct <- meta %>%