-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy path06_abundance-densities-vocalizationRates.Rmd
399 lines (327 loc) · 15.8 KB
/
06_abundance-densities-vocalizationRates.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
---
editor_options:
chunk_output_type: console
---
# Abundance vs. acoustic detection rates
In this script, we model correlations and regressions between raw abundance (as estimated via point count data) and acoustic detection rates (calculated from acoustic data). In addition, we repeat the above analysis by considering distance based measures of densities (refer to script_05).
Here, abundance corresponds to the total number of individuals of a species detected across visits and sites and can only be calculated for point count data.
Here, we will calculate a measure which we define as acoustic detection rate. Acoustic detection rate is calculated as the number of vocalizations/detections in each 10-s audio file over the total number of 10-s audio files analyzed across visits and sites. An example for the same is provided below:
Each site can have around 4 to 5 16-min acoustic surveys. This would mean that the total number of 10-s clips for a site would range from 384 to 480 (16-minutes = 960s = a total of 96 10-s audio files. Across 4 or 5 visits, this number would be 96x4 or 96x5).
Hence, the acoustic detection rate for each species at a site is defined as (x number of vocalizations or acoustic detections)/(total number of 10-s clips analyzed).
## Install necessary libraries
```{r}
library(tidyverse)
library(dplyr)
library(stringr)
library(vegan)
library(ggplot2)
library(scico)
library(data.table)
library(extrafont)
library(ggstatsplot)
library(ggside)
# Source any custom/other internal functions necessary for analysis
source("code/01_internal-functions.R")
```
## Load existing data
```{r}
datSubset <- read.csv("results/datSubset.csv")
density <- read.csv("results/density-estimates.csv")
```
## Estimate abundance from point count data and calculate vocalization rates from acoustic data
```{r}
# point-count data
# estimate total abundance of all species for each site across visits
abundance <- datSubset %>%
filter(data_type == "point_count") %>%
group_by(site_id, scientific_name,
common_name, eBird_codes) %>%
summarise(abundance_pc = sum(number)) %>%
ungroup()
# for acoustic data, we will first estimate the number of visits to a site, which will essentially translate to the number of 10-s clips that were analyzed/heard (used here as the time period for the calculation of vocalization rates)
# nVisits is calculated here
# except for INBS04U, all other sites had a total of 5 visits, while INBS04U had 4 visits
nSitesDays <- datSubset %>%
filter(data_type == "acoustic_data") %>%
dplyr::select(site_id, date)%>%
distinct() %>% arrange(site_id) %>% count(site_id) %>%
rename(nVisits = n)
# estimate total number of detections across the acoustic data
# note: we cannot call this abundance as it refers to the total number of vocalizations across all sites
detections <- datSubset %>%
filter(data_type == "acoustic_data") %>%
group_by(site_id, scientific_name,
common_name, eBird_codes) %>%
summarise(detections_aru = sum(number)) %>%
ungroup()
# estimating acoustic detection rates for each species for each site
aruRate <- detections %>%
left_join(., nSitesDays, by = "site_id") %>%
mutate(nClips = nVisits*96) %>%
mutate(aruRate = detections_aru/nClips)
# Note that the column aruRate can vary between 0 to 1 for each species for each site (this value can vary across sites for each species, referring to how vocally active a species is)
```
## Correlations between acoustic detection rates and abundance from point count data
Here, we correlate abundance from point count data with acoustic detection rates from audio data.
```{r}
# create a single dataframe
data <- full_join(abundance, aruRate)%>%
replace_na(list(abundance_pc = 0, detections_aru = 0,
nVisits = 0, nClips = 0, aruRate = 0))
# previously, we subset species to only include those had a minimum abundance of 20 in point counts. We carry out a similar analysis to only include species that had a minimum of 20 detections in point count data and 20 detections in audio data.
spp_subset <- data %>%
group_by(common_name) %>%
summarise(abundance_pc = sum(abundance_pc), detections_aru = sum(detections_aru)) %>%
ungroup() %>%
filter(abundance_pc >=20 & detections_aru >= 20)
# subset data
# we have 45 species in total
data <- data %>%
filter(common_name %in% spp_subset$common_name)
## join the density data to the above object
data <- left_join(data, density,
by = c("site_id","common_name"))
## remove rows which have no data in either point count or acoustics
data <- data %>%
filter(abundance_pc != 0) %>%
filter(aruRate != 0)
# visualization
fig_abund_aruRate_cor <- ggscatterstats(
data = data,
x = abundance_pc,
y = aruRate,
type = "r",
xlab = "Abundance (point-count)",
ylab = "Acoustic detection rate",
ggplot.component = list(theme(text = element_text(family = "Century Gothic", size = 15, face = "bold"),plot.title = element_text(family = "Century Gothic",
size = 18, face = "bold"),
plot.subtitle = element_text(family = "Century Gothic",
size = 15, face = "bold",color="#1b2838"),
axis.title = element_text(family = "Century Gothic",
size = 15, face = "bold"))))
# Statistically significant and positive correlations were observed between acoustic detection rates and abundance from point counts when data for all 45 species across sites and visits were pooled
ggsave(fig_abund_aruRate_cor, filename = "figs/fig_aruRate_vs_abundance_correlations.png", width = 14, height = 10, device = png(), units = "in", dpi = 300)
dev.off()
## grouping data across sites (rerunning correlations at the species-level)
# grouping point count data
species_group <- data %>%
group_by(scientific_name,
common_name, eBird_codes) %>%
summarise(abundance_pc = sum(abundance_pc),
aruRate = sum(aruRate)) %>%
ungroup()
# visualization
fig_abund_aruRate_species <- ggscatterstats(
data = trial,
x = abundance_pc,
y = aruRate,
type = "r",
xlab = "Abundance (point-count)",
ylab = "Acoustic detection rate",
label.var = common_name,
label.expression = abundance_pc > 200,
point.label.args = list(alpha = 0.7, size = 4, color = "grey50"),
ggplot.component = list(theme(text = element_text(family = "Century Gothic", size = 15, face = "bold"),plot.title = element_text(family = "Century Gothic",
size = 18, face = "bold"),
plot.subtitle = element_text(family = "Century Gothic",
size = 15, face = "bold",color="#1b2838"),
axis.title = element_text(family = "Century Gothic",
size = 15, face = "bold"))))
ggsave(fig_abund_aruRate_species, filename = "figs/fig_aruRate_vs_abundance_correlations_speciesLevel.png", width = 14, height = 10, device = png(), units = "in", dpi = 300)
dev.off()
```


## Correlations between acoustic detection rates and densities (from distance based analysis)
```{r}
# visualization
fig_density_aruRate_cor <- ggscatterstats(
data = data,
x = density,
y = aruRate,
type = "r",
xlab = "Densities",
ylab = "Acoustic detection rate",
ggplot.component = list(theme(text = element_text(family = "Century Gothic", size = 15, face = "bold"),plot.title = element_text(family = "Century Gothic",
size = 18, face = "bold"),
plot.subtitle = element_text(family = "Century Gothic",
size = 15, face = "bold",color="#1b2838"),
axis.title = element_text(family = "Century Gothic",
size = 15, face = "bold"))))
# Statistically significant and positive correlations were observed between acoustic detection rates and densities from point counts when data for all 45 species across sites and visits were pooled. However, the correlative score is far lower when compared to the previous analusis
ggsave(fig_density_aruRate_cor, filename = "figs/fig_aruRate_vs_density_correlations.png", width = 14, height = 10, device = png(), units = "in", dpi = 300)
dev.off()
```
## Regressions between abundance and vocalization rates
```{r}
data <- setDT(data)
# extract t-value
data[, t_value := summary(lm(aruRate ~ abundance_pc))$coefficients[6], by = common_name]
# extract slope
data[, slope := lm(aruRate ~ abundance_pc)%>% coef()%>% nth(2), by = common_name]
# extract pearson's correlation
data[, pearson := cor(aruRate, abundance_pc), by = common_name]
# extract adjusted r squared
data[, r_sq := summary(lm(aruRate ~ abundance_pc))$adj.r.squared, by = common_name]
# create a column with the direction of the slope (whether it is positive or negative), which can be referred to later while plotting
data[, slope_dir := ifelse(slope >0, '+', '-')]
paste("Positive regressions:",length(unique(data$common_name[data$slope_dir %in% c('+')])))
# 39 species had a positive regression or slope value
## visualization
fig_aruRate_abund_reg <- ggplot(data, aes(y = aruRate,
x = abundance_pc)) +
geom_point(color = "#9CC3D5",size = 1.2) +
geom_smooth(data = data, aes(group = common_name,
color = slope_dir),
method = 'lm', se = FALSE,
linewidth = 0.7) +
scale_color_manual(values=c("#1B9E77", "#D95F02")) +
labs(y="\nAcoustic detection rate",
x="Abundance (from point count data)\n") +
theme_bw() +
annotate("text", x=13, y=0.9,
label= "Slope:", col = "grey30", size = 12) +
annotate("text", x=16, y=0.9,
label= "+", col = "#D95F02", size = 12) +
annotate("text", x = 18, y=0.9,
label = "-", col = "#1B9E77", size = 12)+
theme(text = element_text(family = "Century Gothic", size = 18, face = "bold"),plot.title = element_text(family = "Century Gothic",
size = 18, face = "bold"),
plot.subtitle = element_text(family = "Century Gothic",
size = 15, face = "bold",color="#1b2838"),
axis.title = element_text(family = "Century Gothic",
size = 18, face = "bold"),
legend.position = "none")
ggsave(fig_aruRate_abund_reg, filename = "figs/fig_aruRate_vs_abundance_regressions.png", width = 14, height = 12, device = png(), units = "in", dpi = 300)
dev.off()
# extract the slope, t_value, pearson correlation and the adjusted r square
lm_output <- data %>%
dplyr::select(common_name, t_value, slope, pearson, slope_dir,r_sq) %>% distinct()
# write the values to file
write.csv(lm_output, "results/aruRates-abundance-regressions.csv",
row.names = F)
```

## Plotting species-specific regression plots of abundance and acoustic detection rate
```{r}
# visualization
plots <- list()
for(i in 1:length(unique(data$common_name))){
# extract species common name
a <- unique(data$common_name)[i]
# subset data for plotting
for_plot <- data[data$common_name==a,]
# create plots
plots[[i]] <- ggplot(for_plot, aes(y = aruRate,
x = abundance_pc)) +
geom_point(color = "#9CC3D5",size = 1.2) +
geom_smooth(aes(color = "#D95F02"),
method = 'lm', se = TRUE,
linewidth = 0.7) +
labs(title = paste0(a," ","r_sq = ", signif(for_plot$r_sq, digits = 2), " ", paste0("slope = ",signif(for_plot$slope, digits = 4))),
y="\nVocalization rates (from acoustic data)",
x="Abundance (from point count data)\n") +
theme_bw() +
theme(text = element_text(family = "Century Gothic", size = 18, face = "bold"),plot.title = element_text(family = "Century Gothic",
size = 18, face = "bold"),
plot.subtitle = element_text(family = "Century Gothic",
size = 15, face = "bold",color="#1b2838"),
axis.title = element_text(family = "Century Gothic",
size = 18, face = "bold"),
legend.position = "none")
}
# plot and save as a single pdf
cairo_pdf(
filename = "figs/aruRate-abundance-by-species-regressions.pdf",
width = 13, height = 12,
onefile = TRUE
)
plots
dev.off()
```
## Regressions between acoustic detection rate and species densities (based on distance analysis)
```{r}
data <- setDT(data)
# extract t-value
data[, t_value := summary(lm(aruRate ~ density))$coefficients[6], by = common_name]
# extract slope
data[, slope := lm(aruRate ~ density)%>% coef()%>% nth(2), by = common_name]
# extract pearson's correlation
data[, pearson := cor(aruRate, density), by = common_name]
# extract adjusted r squared
data[, r_sq := summary(lm(aruRate ~ density))$adj.r.squared, by = common_name]
# create a column with the direction of the slope (whether it is positive or negative), which can be referred to later while plotting
data[, slope_dir := ifelse(slope >0, '+', '-')]
paste("Positive regressions:",length(unique(data$common_name[data$slope_dir %in% c('+')])))
# 39 species had a positive regression or slope value
## visualization
fig_aruRate_density_reg <- ggplot(data, aes(y = aruRate,
x = density)) +
geom_point(color = "#9CC3D5",size = 1.2) +
geom_smooth(data = data, aes(group = common_name,
color = slope_dir),
method = 'lm', se = FALSE,
linewidth = 0.7) +
scale_color_manual(values=c("#1B9E77", "#D95F02")) +
labs(y="\nAcoustic detection rate",
x="Species densities\n") +
theme_bw() +
annotate("text", x=0.5, y=0.9,
label= "Slope:", col = "grey30", size = 12) +
annotate("text", x=0.7, y=0.9,
label= "+", col = "#D95F02", size = 12) +
annotate("text", x = 0.9, y=0.9,
label = "-", col = "#1B9E77", size = 12)+
theme(text = element_text(family = "Century Gothic", size = 18, face = "bold"),plot.title = element_text(family = "Century Gothic",
size = 18, face = "bold"),
plot.subtitle = element_text(family = "Century Gothic",
size = 15, face = "bold",color="#1b2838"),
axis.title = element_text(family = "Century Gothic",
size = 18, face = "bold"),
legend.position = "none")
ggsave(fig_aruRate_density_reg, filename = "figs/fig_aruRate_vs_density_regressions.png", width = 14, height = 12, device = png(), units = "in", dpi = 300)
dev.off()
# extract the slope, t_value, pearson correlation and the adjusted r square
lm_output <- data %>%
dplyr::select(common_name, t_value, slope, pearson, slope_dir,r_sq) %>% distinct()
# write the values to file
write.csv(lm_output, "results/aruRates-density-regressions.csv",
row.names = F)
```
## Plotting species-specific regression plots of density and acoustic detection rate
```{r}
# visualization
plots <- list()
for(i in 1:length(unique(data$common_name))){
# extract species common name
a <- unique(data$common_name)[i]
# subset data for plotting
for_plot <- data[data$common_name==a,]
# create plots
plots[[i]] <- ggplot(for_plot, aes(y = aruRate,
x = density)) +
geom_point(color = "#9CC3D5",size = 1.2) +
geom_smooth(aes(color = "#D95F02"),
method = 'lm', se = TRUE,
linewidth = 0.7) +
labs(title = paste0(a," ","r_sq = ", signif(for_plot$r_sq, digits = 2), " ", paste0("slope = ",signif(for_plot$slope, digits = 4))),
y="\nVocalization rates (from acoustic data)",
x="Species densities (from point count data)\n") +
theme_bw() +
theme(text = element_text(family = "Century Gothic", size = 18, face = "bold"),plot.title = element_text(family = "Century Gothic",
size = 18, face = "bold"),
plot.subtitle = element_text(family = "Century Gothic",
size = 15, face = "bold",color="#1b2838"),
axis.title = element_text(family = "Century Gothic",
size = 18, face = "bold"),
legend.position = "none")
}
# plot and save as a single pdf
cairo_pdf(
filename = "figs/aruRate-density-by-species-regressions.pdf",
width = 13, height = 12,
onefile = TRUE
)
plots
dev.off()
```