-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy path04_indicator-species-analysis.Rmd
227 lines (185 loc) · 9.28 KB
/
04_indicator-species-analysis.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
---
editor_options:
chunk_output_type: console
---
# Indicator species analysis
This analysis aims to identify what species are “indicators” of groups of samples or treatment type and asks if this varies between point count estimates and acoustic surveys.
If the aim is to determine which species can be used as indicators of certain site group, an approach commonly used in ecology is the Indicator Value (Dufrene and Legendre, 1997). These authors defined an Indicator Value (IndVal) index to measure the association between a species and a site group. The method of Dufrene and Legendre (1997) calculates the IndVal index between the species and each site group and then looks for the group corresponding to the highest association value. Finally, the statistical significance of this relationship is tested using a permutation test.
Diagnostic (or indicator) species are an important tool in vegetation science, because these species can be used to characterize and indicate specific plant community types. A statistic commonly used to determine the association (also known as fidelity, not to be confounded with the indicator value component) between species and vegetation types is Pearson’s phi coefficient of association (Chytry et al., 2002). This coefficient is a measure of the correlation between two binary vectors. The abundance-based counterpart of the phi coefficient is called the point biserial correlation coefficient (which is defined as "r.g").
See: https://cran.r-project.org/web/packages/indicspecies/vignettes/indicspeciesTutorial.pdf
## Load necessary libraries
```{r}
library(tidyverse)
library(dplyr)
library(stringr)
library(vegan)
library(ggplot2)
library(scico)
library(psych)
library(ecodist)
library(RColorBrewer)
library(ggforce)
library(ggalt)
library(patchwork)
library(sjPlot)
library(indicspecies)
# Source any custom/other internal functions necessary for analysis
source("code/01_internal-functions.R")
```
## Load dataframe containing point count and acoustic data
```{r}
datSubset <- read.csv("results/datSubset.csv")
```
## Load species trait data
```{r}
trait <- read.csv("data/species-trait-dat.csv")
# add it to the subset data
datSubset <- left_join(datSubset,trait[,c(1,7,8)], by = "scientific_name")
```
## Estimate detections across visits for point count data and acoustic data
These detections are calculate at the site level (for a total of five/six visits for acoustic & point count data, respectively). If the across_visit_detections = 5, that means that a species was detected every single time across each of the five visits to that site. This value ranges from 1 to 5 (or 1 to 6 for point count data). These detections are estimated for the indicator species analysis since the matrix required for the same cannot be a presence/absence matrix.
```{r}
## we will estimate abundance across point counts by site-date (essentially corresponding to visit)
abundance <- datSubset %>%
filter(data_type == "point_count") %>%
group_by(date, site_id, restoration_type, scientific_name,
common_name, eBird_codes, habitat, foraging_habit) %>% summarise(totAbundance = sum(number)) %>%
ungroup()
# estimate across visit detections for point count data
pc_visit_detections <- abundance %>%
mutate(forDetections = case_when(totAbundance > 0 ~ 1)) %>%
group_by(scientific_name, site_id,restoration_type) %>%
summarise(across_visit_detections = sum(forDetections)) %>%
mutate(data_type = "point_count") %>%
ungroup()
# estimate total number of detections across the acoustic data by site-date (essentially corresponds to a visit)
# note: we cannot call this abundance as it refers to the total number of vocalizations across a 16-min period across all sites
detections <- datSubset %>%
filter(data_type == "acoustic_data") %>%
group_by(date, site_id, restoration_type, scientific_name,
common_name, eBird_codes, habitat, foraging_habit) %>% summarise(totDetections = sum(number)) %>%
ungroup()
# estimate across visit detections for acoustic data
aru_visit_detections <- detections %>%
mutate(forDetections = case_when(totDetections > 0 ~ 1)) %>%
group_by(scientific_name, site_id,restoration_type) %>%
summarise(across_visit_detections = sum(forDetections)) %>%
mutate(data_type = "acoustic_data") %>%
ungroup()
```
## Indicator species analysis of the point-count dataset
The detections across visits is converted to a wide format for the sake of analysis.
```{r}
## preparing the point count data for the indicator species analysis
pc_indicator <- pc_visit_detections %>%
group_by(scientific_name, site_id, restoration_type) %>%
pivot_wider(names_from = scientific_name,
values_from = across_visit_detections,
values_fill = list(across_visit_detections=0))
## indicator species analysis
indic_pc <- multipatt(pc_indicator[,4:ncol(pc_indicator)], pc_indicator$restoration_type, func = "r.g", control = how(nperm=999))
# analyze/summarize the results
summary(indic_pc)
```
Multilevel pattern analysis for point count data
Association function: r.g
Significance level (alpha): 0.05
Total number of species: 83
Selected number of species: 25
Number of species associated to 1 group: 16
Number of species associated to 2 groups: 9
List of species associated to each combination:
Group BM #sps. 13
stat p.value
Alcippe poioicephala 0.812 0.001 ***
Culicicapa ceylonensis 0.739 0.001 ***
Hypothymis azurea 0.665 0.001 ***
Pellorneum ruficeps 0.599 0.001 ***
Ducula badia 0.596 0.001 ***
Harpactes fasciatus 0.565 0.001 ***
Cyornis pallidipes 0.561 0.001 ***
Leptocoma minima 0.531 0.002 **
Phylloscopus magnirostris 0.505 0.004 **
Irena puella 0.457 0.005 **
Dicrurus paradiseus 0.439 0.010 **
Chalcophaps indica 0.380 0.032 *
Dicrurus aeneus 0.376 0.029 *
Group NR #sps. 3
stat p.value
Cinnyris asiaticus 0.702 0.001 ***
Acrocephalus dumetorum 0.507 0.003 **
Machlolophus aplonotus 0.396 0.021 *
Group AR+BM #sps. 2
stat p.value
Muscicapa muttui 0.509 0.003 **
Iole indica 0.498 0.003 **
Group AR+NR #sps. 7
stat p.value
Streptopelia chinensis 0.744 0.001 ***
Pycnonotus jocosus 0.744 0.001 ***
Dicaeum concolor 0.474 0.002 **
Psittacula columboides 0.403 0.020 *
Zosterops palpebrosus 0.394 0.025 *
Orthotomus sutorius 0.389 0.026 *
Psittacula cyanocephala 0.376 0.023 *
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 .
## Indicator species analysis for acoustic data
The detections across visits is converted to a wide format for the sake of analysis.
```{r}
## preparing the acoustic data for the indicator species analysis
aru_indicator <- aru_visit_detections %>%
group_by(scientific_name, site_id, restoration_type) %>%
pivot_wider(names_from = scientific_name,
values_from = across_visit_detections,
values_fill = list(across_visit_detections=0))
## indicator species analysis
indic_aru <- multipatt(aru_indicator[,4:ncol(aru_indicator)], aru_indicator$restoration_type, func = "r.g", control = how(nperm=999))
# analyze/summarize the results
summary(indic_aru)
```
Multilevel pattern analysis for the acoustic dataset
Association function: r.g
Significance level (alpha): 0.05
Total number of species: 113
Selected number of species: 29
Number of species associated to 1 group: 19
Number of species associated to 2 groups: 10
List of species associated to each combination:
Group AR #sps. 1
stat p.value
Chloropsis aurifrons 0.396 0.044 *
Group BM #sps. 12
stat p.value
Alcippe poioicephala 0.666 0.001 ***
Hypothymis azurea 0.612 0.001 ***
Phylloscopus magnirostris 0.602 0.001 ***
Cyornis pallidipes 0.601 0.001 ***
Ducula badia 0.468 0.009 **
Culicicapa ceylonensis 0.460 0.005 **
Sitta frontalis 0.459 0.007 **
Harpactes fasciatus 0.454 0.012 *
Dryocopus javensis 0.430 0.007 **
Muscicapa muttui 0.412 0.009 **
Leptocoma minima 0.395 0.030 *
Picumnus innominatus 0.346 0.042 *
Group NR #sps. 6
stat p.value
Streptopelia chinensis 0.554 0.002 **
Orthotomus sutorius 0.503 0.002 **
Tephrodornis sylvicola 0.467 0.002 **
Carpodacus erythrinus 0.421 0.015 *
Phylloscopus affinis 0.408 0.020 *
Lonchura kelaarti 0.367 0.021 *
Group AR+NR #sps. 10
stat p.value
Pycnonotus jocosus 0.830 0.001 ***
Acrocephalus dumetorum 0.543 0.002 **
Corvus splendens 0.507 0.003 **
Psittacula cyanocephala 0.447 0.007 **
Merops leschenaulti 0.441 0.009 **
Dicrurus leucophaeus 0.438 0.014 *
Copsychus saularis 0.430 0.016 *
Pavo cristatus 0.426 0.014 *
Halcyon smyrnensis 0.403 0.015 *
Dinopium benghalense 0.386 0.028 *
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 .