---
title: "A new approach to interspecific synchrony in population ecology using tail association"
author: "Shyamolina Ghosh, Lawrence W. Sheppard, Philip C. Reid, Daniel Reuman"
fontsize: 12 pt
geometry: "left=1 in,right=1 in,top=1 in,bottom=1 in"
output:
  pdf_document:
    number_sections: no
    keep_tex: yes
    fig_caption: yes
    includes:
      in_header: head_MainText.sty
mainfont: Times New Roman
tables: True
link-citations: True
urlcolor : blue
indent : True
csl: TheAmericanNaturalist.csl
bibliography: REF_CIS.bib
---
```{r setup_MainText, echo=F}
library(rmarkdown)
knitr::opts_chunk$set(echo = TRUE, fig.pos = "H")
options(scipen = 1, digits = 5) # Print numbers (including inline R output) with up to 5 significant digits
seed<-101
```
\noindent \emph{Affiliations:}
\noindent Ghosh: Department of Ecology and Evolutionary Biology and Kansas Biological Survey, University of Kansas, Lawrence, KS, 66045, USA
\noindent Sheppard: Department of Ecology and Evolutionary Biology and Kansas Biological Survey, University of Kansas, Lawrence, KS, 66045, USA
\noindent Reid: Continuous Plankton Recorder Survey, The Marine Biological Association, The Laboratory, Citadel Hill, Plymouth PL1 2PB, UK; School of Biological & Marine Sciences, University of Plymouth, Drake Circus, Plymouth PL4 8AA, UK
\noindent Reuman: Department of Ecology and Evolutionary Biology and Kansas Biological Survey, University of Kansas, Lawrence, KS, 66045, USA
\noindent \emph{Correspondence:} Daniel Reuman, 2101 Constant Ave, Lawrence, KS, 66047, reuman@ku.edu, 626 560 7084.
\noindent \emph{Short title/ Running head:} Interspecific synchrony and tail association
\newpage
# Abstract
\noindent Standard methods for studying the association between two ecologically important variables
provide only a small slice of the information content of the association, but
<!--DAN CHANGED: methods--> statistical approaches<!--END CHANGES--> are available<!--DAN CHANGED:, based on *copulas*,--><!--END CHANGES--> that provide comprehensive information.
In particular,<!--DAN CHANGED:copula--> available <!--END CHANGES-->approaches can reveal
*tail associations*, i.e., accentuated or reduced associations
between the more extreme values of variables. We here study the nature
and causes of tail associations between phenological or population-density
variables of co-located species, and their ecological importance. We employ a
simple method of measuring tail associations which we call the *partial Spearman correlation*.
Using multidecadal, multi-species spatiotemporal datasets on aphid first flights and marine
phytoplankton population densities, we assess the potential for tail association
to illuminate two major topics of study in community ecology: the stability or instability of
aggregate community measures such as total community biomass and its relationship
with the synchronous or compensatory dynamics of the community's constituent species;
and the potential for fluctuations and trends in species phenology to
result in trophic mismatches. We find that positively associated fluctuations
in the population densities of co-located species commonly show asymmetric tail
associations, i.e., it is common for two species' densities
to be more correlated when large than when small, or vice versa. Ordinary measures
of association such as correlation do not take this asymmetry into account. Likewise,
positively associated fluctuations in the phenology of co-located species
also commonly show asymmetric tail associations. We provide evidence that tail associations
between two or more species' population density or phenology time series can be inherited
from mutual tail associations of these quantities with an environmental driver. We argue
that our understanding of community dynamics and stability, and of
phenologies of interacting species, can be meaningfully improved in future work by taking into
account tail associations.
\vspace{.5cm}
\noindent \textbf{\textit{Keywords:}} aphids, copula, inter-species synchrony,
match-mismatch hypothesis, plankton, tail association
\newpage
# Introduction\label{Introduction}
All ecologists study relationships between biological and environmental
variables and among biological variables. But standard methods for studying the association between two variables
provide only a small slice of the information content of the association.
For instance, the two pairs of variables in Fig. \ref{pedagogfig}A, B have identical
Pearson correlation coefficients, and also have identical Spearman correlation coefficients,
but nonetheless display
very different patterns of association [@Ghosh_copula; @ghosh2020tail].<!--***Shya CHANGED: added new citation of our Ecosphere paper.--> Correlations
are not the only way to study associations,
but they are very commonly used, and other standard methods in ecology
provide a similarly limited amount of information that neglects patterns of association
[@nelsen2006_copula; @Genest2007; @joe2014_dependence; @MaiScherer2017; @Anderson2018] that seem likely
to be ecologically important [@Ghosh_copula; @ghosh2020tail].<!--***Shya CHANGED: added new citation of our Ecosphere paper.-->
The variables of Fig. \ref{pedagogfig}A
(respectively, Fig. \ref{pedagogfig}B) are more strongly related in the left
(respectively, right) portions of their distributions,
thereby displaying asymmetric associations of the distribution tails, henceforth called
asymmetric *tail association*. For two
positively associated variables, stronger association
between values in the left or lower portions of the distributions of the variables is henceforth
referred to as *left-tail association* (Fig. \ref{pedagogfig}A), whereas stronger association
between values in the right or upper portions of the distributions of the variables is henceforth
referred to as *right-tail association* (Fig. \ref{pedagogfig}B). The word "distribution"
is sometimes omitted from the terminology, but implied. Tail association is
a potentially important pattern of association
that is not captured by standard correlation coefficients.
<!--END MODIFIED TEXT-->
Statistical approaches exist, however, that provide a complete description
of the relationship between variables; these approaches are based on the idea of the *copula*.
Tail associations are an important aspect of a copula approach to dependence,
and tail association will be a focus of this paper.
We here give a conceptual flavor of copulas before subsequently focusing on
tail association. <!--DAN CHANGED:-->We introduce copulas instead of proceeding directly to
tail associations, for three reasons: to properly credit the copula ideas at the root of
our tail association tools, and the researchers who developed them;
to indicate the origin of our tail association tools, so that future researchers
seeking to generalize our approach will have a place to start; and to introduce ideas
(normalized rank plots - see below) that are necessary to define our measures of
tail association.<!--END MODIFIED TEXT--> Copulas can be used to separate the
information content of a bivariate dataset, $(x_t,y_t)$ for $t=1,\ldots,T$,
into two non-overlapping parts: the information in the marginal
distributions (which is not about the association between the variables)
and the rest of the information (which is solely about the association).
Following @Ghosh_copula and @Genest2007, the isolated information about the
association between $x_t$ and $y_t$ is revealed by the plot of $v_t$
against $u_t$, where $u_t$ is the rank of $x_t$ in the set
$\{ x_1,x_2,\ldots,x_T \}$, divided by $T+1$; and $v_t$ is the rank
of $y_t$ in the set
$\{ y_1,y_2,\ldots,y_T \}$, also divided by $T+1$. Here the rank of the smallest
element of a set is understood to be $1$. We refer to the $u_t$ and $v_t$ as *normalized ranks*
of the $x_t$ and $y_t$. We refer to the plot of $v_t$ against $u_t$ as
the *normalized rank plot* for $y_t$ and $x_t$.
For instance, the normalized rank plots for
Fig. \ref{pedagogfig}A,B are in Fig. \ref{pedagogfig}C,D, and show the asymmetric
associations in the tails.
The normalized rank plot reflects the copula structure of
$(x_t, y_t)$ [@Ghosh_copula; @Genest2007].
Ranking makes the marginal distributions uniform, isolating only the information
on association between the variables.<!--DAN CHANGED:--> @Genest2007 states that
inferences about dependence structures should always be based on ranks.<!--END MODIFICATION-->
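
As a minimal sketch (not the analysis code used for this study), normalized ranks as defined above can be computed in base R. The function name `normalized_ranks`, the simulated data, and the use of average ranks for ties are assumptions made for illustration only; the text above does not specify a tie rule.

```r
# Normalized ranks: rank within the sample (smallest = 1), divided by T + 1.
# Tie handling ("average") is an assumption of this sketch.
normalized_ranks <- function(x) {
  rank(x, ties.method = "average") / (length(x) + 1)
}

set.seed(101)
x <- rexp(50)                 # strongly skewed marginal distribution
y <- x + rnorm(50, sd = 0.5)  # positively associated with x

u <- normalized_ranks(x)
v <- normalized_ranks(y)

# Normalized rank plot: v against u on the unit square
plot(u, v, xlim = c(0, 1), ylim = c(0, 1), xlab = "u", ylab = "v")

# A monotone transformation of a marginal leaves the normalized ranks unchanged,
# which is why the plot carries only information about the association
all.equal(u, normalized_ranks(log(x)))
```

Dividing by $T+1$ rather than $T$ keeps every normalized rank strictly inside the unit interval.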
It is likewise the purpose of copula approaches
to separate association information from information on marginals.
We emphasize that we have not here provided a formal definition of copulas, instead only
introducing the fundamental copula idea of separating dependence information from
information on marginals.
Brief [@Genest2007; @Anderson2018; @Ghosh_copula]
and comprehensive [@nelsen2006_copula; @joe2014_dependence; @MaiScherer2017] introductions
to copulas are available elsewhere. Copulas can also be used to study multivariate data.
Copula approaches are applied
widely and to great effect in fields such as finance and neuroscience [@Li2000; @Kim2008;
@Serinaldi2008; @onken2009; @li2013; @Emura2016; @She2018; @Goswami2018],
but only rarely, so far, in ecology
[@Valpine2014; @Anderson2018; @Popovic2019; @Ghosh_copula; @ghosh2020tail].<!--***Shya CHANGED: new citation of our ecosphere paper added.-->
The potential of copulas for
improving ecological understanding was
argued by @Ghosh_copula, and those authors also introduced tail association as an important aspect
of copula structure, and elaborated the relationship
between tail association and copulas.
The study of @Ghosh_copula was a wide-ranging study of the importance,
causes and consequences of copula structures in associations between
ecological variables. One of the main foci of that paper was
associations between fluctuations through time of
population density or phenological measurements of the same
species in different locations. This study instead focuses on population
density and phenological measurements of different species in the same location.
@Ghosh_copula studied, for instance, associations between first-flight time
series, for a given species of aphid, measured at different locations in
the United Kingdom (UK); and
associations between plankton density time series, for a given plankton taxon,
measured at different locations in seas around the UK. We instead study
associations between first-flight or population density time series
measured in the same location for different (sympatric) species. Thus,
in contrast with the study of @Ghosh_copula, this study
belongs more to community ecology than to spatial ecology.
Our reasons for this shift are as follows.
First, *synchronous* (positively correlated)
and *compensatory* (negatively correlated)
population density dynamics
of different species occupying the same area are longstanding
topics of concern in community ecology, with important ramifications
for the stability or instability of aggregate community or ecosystem
properties [@raimondo2004interspecific; @kent2007synchrony;
@loreau2008species; @gonzalez2009causes; @jochimsen2013compensatory];
there are reasons to believe tail associations in this context will play an important
but unstudied role in understanding these topics.
A major past insight into community dynamics [@gonzalez2009causes]
was that an aggregate
property of a community, such as its total biomass, can be relatively
stable through time although its constituent parts (population
biomasses of individual species)
are highly variable, if the parts show compensatory dynamics [@hallett2014].
Likewise,
synchrony amplifies community biomass variability because
the concordant variations of species biomass time series reinforce each
other in the total [@Ma2017]. If synchronous fluctuations show right-tail association,
then species are highly abundant simultaneously, which may produce
years of extremely high community biomass. Alternatively, if synchronous
fluctuations show left-tail association, species are very scarce simultaneously,
potentially producing years of extremely low community biomass.
Thus the tail association of synchrony, not just the presence and strength
of synchrony, may independently influence temporal variability of
aggregate community properties. This is revisited in the \nameref{Discussion}.
Second, studies of the phenology of species interacting in one area
have also played a central role in community ecology, with
important ramifications for whether and to what extent interactions
will be modified by climate change [@Durant2007; @Yang2010];
there are reasons to believe tail
associations between variables in this context may play an
important role, as well.
As climate changes and phenologies shift, there is the potential for
phenologies of interacting species to shift differently, disrupting
the interaction [@Thackery2010]. This idea is referred to as the match-mismatch
hypothesis. Even if, for instance, year-to-year fluctuations in the emergence times
of two interacting species are
highly correlated, if this correlation is principally in the
right (respectively, left) tails of the distributions of possible emergence times,
so that early (respectively, late) emergences of the species are actually uncorrelated, then
mismatched years are likely to occur, impacting the species.
Such mismatches will occur, in this conceptual example,
when emergence is early (respectively, late).
Essentially, even with substantial correlation between emergence dates of species, if this correlation
is principally in one of the tails, then uncorrelated emergences, and therefore mismatches,
can occur under some conditions.
One potential mechanism by which early emergences, for example,
may be uncorrelated between species while later emergences remain correlated
is if both species follow the same environmental cue for their
emergence, but physiological limitations of only one of the species prevent
emergence before a certain date. Advancing emergence dates of myriad species
make this scenario more plausible.
We here begin exploring whether tail associations may be important for
studies of synchrony and compensatory dynamics, and for studies
of phenology and the match-mismatch hypothesis. We use a
56-year dataset of population densities of 4 species of dinoflagellates
from the *Ceratium* genus, from 15 locations in the seas around the
UK; and a 35-year dataset of annual first-flight dates
for 20 species of aphid from 10 locations within the UK.
The terms left- and right-tail
association, defined above, do not apply to negatively associated
variables, because the
negative association means values in the left tail
of one variable are associated with those in the right tail of the other;
slightly modified methods are required to study tail association and its
asymmetry in negatively associated variables.
But our aphid and plankton
population and phenology variables were
almost exclusively positively associated with each other (see \nameref{Results}).
Therefore, we introduce methods and present results in this study
chiefly for the case of positively associated variables,
returning to the topics of negatively
associated variables and compensatory population dynamics
in the \nameref{Discussion}.
In addition to examining whether tail association in our data is
asymmetric, we also test for possible causes of such patterns.
One possible mechanism, similar to some of the mechanisms
explored by @Ghosh_copula,
is explained for the *Ceratium* example as
follows. Earlier work showed that average sea surface temperature
is an important
correlate of phytoplankton abundance in our data
[e.g., @defriez2016climate; @sheppard2017; @sheppard2019]:
cold water is associated with more phytoplankton, likely because upwelling and
mixing of the surface and deeper ocean layers bring both
nutrients and cold water to the photic zone.
However, if it is the case for a given location
that very cold water is associated with no more
*Ceratium*, on average, than is moderately cold water, then
that corresponds to a positive relationship and a left-tail association
between the "coldness" of the surface water (measured, for instance,
by how many degrees colder the water is than average)
and *Ceratium* abundance. If such tail association is strong and consistent
across *Ceratium* species, it should
produce positive relationships with left-tail association between
the abundance time series of the species. Likewise,
in locations for which the winter coldness-*Ceratium* abundance association
shows less left-tail association, one should see less
left-tail association between different *Ceratium* species.
So tail association between two species may be inherited from tail
association of each species with a common environmental driver.
Phytoplankton are also strongly influenced by the abundant generalist copepod
consumer *Calanus finmarchicus*, so our actual investigation of the mechanism
proposed here will take into account this influence as well as the
association with sea surface temperature. For aphid first flight, we examine
the same potential mechanism, but the relevant driver in that case is
winter temperature.
Thus this paper focuses on whether and why population density
or phenological time series of co-located species may show asymmetric
patterns in their tail-associations, with a focus on positively associated variables
because positive associations are what occurred in the
available data. We ask the following specific questions.
(Q1) Do synchronous/positively correlated population density or phenological time series
of co-located species commonly show asymmetric tail associations? (Q2) If so,
what are the causes of these patterns? We examine potential ecological consequences of
asymmetric tail associations in the \nameref{Discussion}. We regard our investigation
as a first step toward a better understanding of the potential importance of
asymmetric tail associations for such central ecological topics as synchrony and
compensatory dynamics in communities and their influence on community stability;
and the match-mismatch hypothesis in phenology. The \nameref{Discussion} also has
additional thoughts on next steps toward this goal.
Our results and the conceptual considerations
introduced above are good evidence, in our view, of the potential for tail
association to make a crucial difference in how ecologists understand these
important topics.
```{r read_res,echo=F}
res_ff<-readRDS("./Results/aphid_results/ff_npa_stat_results/cor_npa_diff_ff_ln_all.RDS")
res_cer<-readRDS("./Results/plankton_results/npa_stat_results/cor_npa_diff_plankton_ln_all.RDS")
```
# Methods\label{M&M}
## Data\label{Data}
Our population dataset comprised average annual
abundance estimates for 15 locations (Fig. \ref{SM-fig_plankton_map})
in the North Sea and British seas for 4 species from the *Ceratium*
genus of dinoflagellates, and for the generalist consumer copepod species
*C. finmarchicus*, for the 56 years 1958 to 2013.
These data were a subset of a larger dataset covering 22 taxa
and 26 locations, analyzed by @sheppard2017, @sheppard2019, and @Ghosh_copula.
The locations are $2^\circ$ by $2^\circ$ grid cells.
The data were originally obtained from the
Continuous Plankton Recorder (CPR) dataset,
now operated and maintained by
the Marine Biological Association of the United Kingdom.
Data preprocessing steps were the same as used by @Ghosh_copula.
*Ceratium* species were extracted in part because they
have a role in harmful algal blooms (red tides) [@Baek2009]; and also because four
species were available from the genus (Table \ref{tab_plankton_aphid_info}),
and we chose closely related
species because they may
be influenced in similar ways
by environmental variables. The 15 locations we used were
selected from the 26 locations of the larger dataset
(Fig. \ref{SM-fig_plankton_map}) as follows.
First, to reduce the effects of sampling variation on statistical results,
we chose the subset of locations for which more than 35 years
of data were available for all species.
Second, for a given location, we excluded *Ceratium* species
that were undetected for more than 10$\%$ of sampled years at that location.
Finally, we considered only those locations for which at least two
*Ceratium* species remained. We also had data on average growing season
sea surface temperature for each grid cell and year
[@sheppard2017; @sheppard2019]. Earlier analyses [e.g., @sheppard2019]
demonstrated that sea surface temperature and *C. finmarchicus*
abundance are important covariates of phytoplankton dynamics
in UK seas,
though associations between temperature and phytoplankton are
probably due to relationships both these variables have with nutrient
abundance in surface ocean layers. Sea surface
temperature data preprocessing was the same as used by @sheppard2017.
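
The three screening steps described above can be sketched as follows. This is an illustration under stated assumptions, not the preprocessing code used for this study: the function `screen_locations`, the toy helper `toy`, and the data layout (a named list of year-by-species matrices, with `NA` for unsampled years and `0` for sampled years in which a species was undetected) are all hypothetical.

```r
# Hypothetical layout: abund is a named list, one matrix per location,
# rows = years, columns = species; NA = not sampled, 0 = sampled but undetected.
screen_locations <- function(abund, min_years = 35, max_undetected = 0.10,
                             min_species = 2) {
  kept <- list()
  for (loc in names(abund)) {
    m <- abund[[loc]]
    # Step 1: require more than min_years years with data for all species
    sampled <- stats::complete.cases(m)
    if (sum(sampled) <= min_years) next
    # Step 2: drop species undetected in more than 10% of sampled years
    frac_undetected <- colMeans(m[sampled, , drop = FALSE] == 0)
    m <- m[, frac_undetected <= max_undetected, drop = FALSE]
    # Step 3: keep the location only if at least two species remain
    if (ncol(m) >= min_species) kept[[loc]] <- m
  }
  kept
}

# Toy data: 56 years, 4 species; some early years unsampled, some zeros in sp4
toy <- function(n_sampled, n_zero_sp4) {
  m <- matrix(1, nrow = 56, ncol = 4,
              dimnames = list(1958:2013, paste0("sp", 1:4)))
  if (n_sampled < 56) m[seq_len(56 - n_sampled), ] <- NA
  if (n_zero_sp4 > 0) m[56 - seq_len(n_zero_sp4) + 1, "sp4"] <- 0
  m
}
abund <- list(A = toy(56, 0), B = toy(30, 0), C = toy(50, 10))
kept <- screen_locations(abund)
names(kept)        # locations that pass all three steps
sapply(kept, ncol) # number of species retained per location
```

In this toy example, location B fails the first step (30 sampled years), and location C loses `sp4` (undetected in 10 of 50 sampled years) but is retained with three species.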
Our phenology dataset comprised annual first flight dates
for 20 aphid species (Table \ref{tab_plankton_aphid_info})
from 10 locations across the UK (Fig. \ref{SM-fig_aphid_map}),
spanning the 35 years 1976 to 2010. These data were a subset of a larger
dataset covering 11 locations, analyzed previously by @sheppard2016
and @Ghosh_copula. The data were originally obtained from the
Rothamsted Insect Survey suction-trap dataset
[@harrington2014; @bell2015]. Data preprocessing
was the same as that of @sheppard2016.
Locations were screened, leading to the removal of one of the original
11 sampling locations, by requiring at least 30 years of data be
available for all species, again to reduce sampling variation of
statistics. We also had time series of winter average
temperature for each location and year. The winter temperature for
year $t$ was the average temperature from December of year $t-1$ through March of year $t$.
Earlier analyses have demonstrated the importance of winter temperature
for aphid first flight date [e.g., @sheppard2016].
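
The winter-temperature definition above can be illustrated with a short sketch. The data frame `monthly` (columns `year`, `month`, `temp`) and the function `winter_mean` are hypothetical, and taking an equal-weight mean of four monthly values is an assumption; the actual temperature preprocessing may have differed.

```r
# Winter temperature for year t: mean over December of year t-1 and
# January to March of year t (assumed here to be four monthly values).
winter_mean <- function(monthly) {
  in_winter <- monthly$month %in% c(12, 1, 2, 3)
  # December belongs to the winter of the following year
  winter_year <- ifelse(monthly$month == 12, monthly$year + 1, monthly$year)
  temp_w <- monthly$temp[in_winter]
  year_w <- winter_year[in_winter]
  means <- tapply(temp_w, year_w, mean)
  n_months <- tapply(temp_w, year_w, length)
  means[n_months == 4]  # keep only winters with all four months present
}

# Toy data in which the "temperature" equals the month number
monthly <- expand.grid(month = 1:12, year = 1975:1977)
monthly$temp <- as.numeric(monthly$month)
winter_mean(monthly)
```

Winters with fewer than four available months (here, the first and last) are dropped rather than averaged over a partial season.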
<!--\pagebreak
\begin{centering}
\includegraphics[width=10cm]{./Results/pedagog_fig.pdf}
\captionsetup{parbox=none}
\captionof{figure}[short caption]{Pedagogical figure for introducing tail association and partial Spearman correlation.
(A, B) Two pairs of variables that have identical Pearson (P) correlation, and also identical Spearman (S)
correlation,
but that differ markedly in the nature of the association. Panel A shows stronger left- than right-tail association
and panel B shows the reverse. (C, D) Normalized rank plots
(see \nameref{Introduction}) for panels A and B, respectively.
(E, F) Graphics supporting the definitions of partial Spearman correlation and our statistic measuring
asymmetry of tail association (see \nameref{M&M}). This figure is similar in some respects to Figs 1 and 7 of Ghosh \emph{et al} (2020).}\label{pedagogfig}
\end{centering}-->
<!--Table with species info -->
```{r tab_plankton_aphid_info, echo=F, results='asis',message=F}
library(tinytex)
library(tibble)
library(kableExtra)
library(dplyr)
#aphid_info_org <- tibble(c0=c(1:20),
# c1=c("Apple grass aphid","Bird cherry oat aphid","Black bean aphid","Blackberry cereal aphid",
# "Blackcurrant sowthistle aphid",
# "Corn leaf aphid","Currant lettuce aphid","Damson hop aphid","Grain aphid","Green spruce aphid",
# "Leaf-curling plum aphid","Mealy cabbage aphid","Mealy plum aphid","Pea aphid","Peach potato aphid",
# "Potato aphid","Rose grain aphid","Shallot aphid","Sycamore aphid","Willow carrot aphid"),
# c2=c("Rhopalosiphum insertum", "Rhopalosiphum padi", "Aphis fabae", "Sitobion fragariae", "Hyperomyzus lactucae",
# "Rhopalosiphum maidis", "Nasonovia ribisnigri","Phorodon humuli","Sitobion avenae","Elatobium abietinum",
# "Brachycaudus helichrysi","Brevicoryne brassicae","Hyalopterus pruni","Acyrthosiphon pisum","Myzus persicae",
# "Macrosiphum euphorbiae","Metopolophium dirhodum","Myzus ascalonicus","Drepanosiphum platanoidis","Cavariella #aegopodii")
#)
plankton_aphid_info <- tibble(c0=c(1:4,1:20),
c2=c("Ceratium fusus","Ceratium furca","Ceratium tripos","Ceratium macroceros",
"Rhopalosiphum insertum", "Rhopalosiphum padi", "Aphis fabae", "Sitobion fragariae",
"Hyperomyzus lactucae", "Rhopalosiphum maidis", "Nasonovia ribisnigri","Phorodon humuli",
"Sitobion avenae","Elatobium abietinum", "Brachycaudus helichrysi","Brevicoryne brassicae",
"Hyalopterus pruni","Acyrthosiphon pisum","Myzus persicae", "Macrosiphum euphorbiae",
"Metopolophium dirhodum","Myzus ascalonicus","Drepanosiphum platanoidis","Cavariella aegopodii" ))
knitr::kable(plankton_aphid_info, "latex", booktabs = T, linesep = "\\addlinespace",align="c",
caption = "Names of 4 plankton and 20 aphid species for which data were used. \\label{tab_plankton_aphid_info}",
col.names = NULL)%>%
pack_rows("Plankton", 1,4) %>% # pack_rows() is kableExtra's current name for group_rows(), which dplyr (loaded above) masks
pack_rows("Aphids", 5,24) %>%
column_spec(column = 2,italic=T)%>%add_header_above(header=c("Species ID" = 1, "Latin binomial" = 1))
```
## Statistical methods\label{Methods}
Given bivariate data $(x_t,y_t)$ for a set of years, $t$, of size $T$,
and after computing
normalized ranks $(u_t,v_t)$ as described in the \nameref{Introduction}, tail
association and asymmetry of tail association were measured using the
*partial Spearman correlation* of @Ghosh_copula, which we here reintroduce.
The standard Spearman correlation itself measures association between the variables
$x_t$ and $y_t$ (or between $u_t$ and $v_t$ - recall the Spearman correlation
is based on ranks, so is the same for both sets of variables);
but Spearman correlation measures only the overall
association of the samples and cannot tell us how association varies across the distributions
of the variables.
Given two bounds $0\leq l_b < u_b \leq 1$, we define the boundary lines
$u+v=2l_b$ and $u+v=2u_b$ (Fig. \ref{pedagogfig}E),
which intersect the unit square on which
normalized ranks are plotted.<!--DAN MODIFIED:--> The partial Spearman correlation associated
with the bounds $l_b$ and $u_b$ will be the portion of the Spearman correlation attributable
to the points that fall between these boundary lines.<!--END CHANGES-->
The partial Spearman correlation for the
band between these boundaries and within the unit square is
\begin{equation}\label{eq.Cor}
\cor_{l_b,u_b}(u,v) = \frac{\sum
(u_t-\mean(u)) (v_t-\mean(v))}{(T-1)\sqrt{\var(u)\var(v)}}.
\end{equation}
\noindent Here, sample means and sample variances are computed using all $T$ data points,
but the sum, $\Sigma$, is over only the indices $t$ for which $u_t+v_t > 2l_b$ and
$u_t+v_t < 2u_b$. The partial Spearman correlation is not defined if there
are no points in the band. For positively associated
$(u_t,v_t)$, the partial Spearman correlations
$\cor_{0,b}$ and $\cor_{1-b,1}$ for $b\leq 0.5$ (Fig. \ref{pedagogfig}F)
measure association in the left and right tails, respectively, and can be compared
via a difference, $\cor_{0,b}-\cor_{1-b,1}$, to measure asymmetry of tail
association. Positive (respectively, negative) values of this difference mean stronger left-tail
(respectively, right-tail) association.
The sum of $\cor_{0,0.5}$ and
$\cor_{0.5,1}$ (or the sum of
$\cor_{l_{b_k},u_{b_k}}$ for any other choice of bands
$(l_{b_k},u_{b_k})$ that partition $(0,1)$) equals the standard
Spearman correlation, as long as no points happen to lie exactly
on the bounds.
<!--DAN CHANGED:-->
Notation is summarized in Table \ref{SM-tab_notation}.
<!--END OF NEW TEXT-->
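As a minimal sketch of equation \ref{eq.Cor}, the partial Spearman correlation could be computed in R as follows. The function name `partial_spearman` is hypothetical, the data are simulated, and the normalization of ranks as rank divided by $T+1$ is an assumption of this sketch.

```r
# Minimal sketch of the partial Spearman correlation (hypothetical function name).
# Assumption: ranks are normalized as rank / (T + 1) so they lie strictly in (0, 1).
partial_spearman <- function(x, y, lb, ub) {
  stopifnot(length(x) == length(y), lb >= 0, ub <= 1, lb < ub)
  n <- length(x)
  u <- rank(x) / (n + 1)
  v <- rank(y) / (n + 1)
  # Only the sum is restricted to the band; means and variances use all points.
  in_band <- (u + v > 2 * lb) & (u + v < 2 * ub)
  if (!any(in_band)) return(NA_real_)  # undefined when the band is empty
  sum((u[in_band] - mean(u)) * (v[in_band] - mean(v))) /
    ((n - 1) * sqrt(var(u) * var(v)))
}

set.seed(1)
x <- rnorm(49)        # simulated data, for illustration only
y <- x + rnorm(49)

# Asymmetry of tail association for b = 1/3.
b <- 1 / 3
asymmetry <- partial_spearman(x, y, 0, b) - partial_spearman(x, y, 1 - b, 1)

# Bands that partition (0, 1) sum to the ordinary Spearman correlation.
# With T = 49, no point can lie exactly on u + v = 2/3 or u + v = 4/3.
parts <- c(partial_spearman(x, y, 0, 1 / 3),
           partial_spearman(x, y, 1 / 3, 2 / 3),
           partial_spearman(x, y, 2 / 3, 1))
spearman <- cor(x, y, method = "spearman")
```

Here `sum(parts)` and `spearman` agree up to floating-point error, and a positive value of `asymmetry` would indicate stronger left-tail association.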
For each sampling location, $n$, we computed a matrix, $C^n$, called the
*community tail association matrix*, which quantifies
asymmetry of tail association between pairs of aphid species or pairs of *Ceratium*
species at $n$. Denote by
$s_i^n(t)$ the aphid first flight date or the *Ceratium* population density
for sampling location $n$, for the $i^{\text{th}}$ species that was present in the
cleaned data for location $n$, and for year $t$.
We then defined the matrix $C^n$
by defining $C^n(i,j)$ for two aphid or *Ceratium*
species $i,j$, as follows. <!--DAN MODIFIED THE FOLLOWING TEXT-->First, $C^n(i,j)$ was not defined, or was defined to
equal the missing-data placeholder "NA", if any of three conditions held: A) $i=j$; B) the hypothesis
that $s_i^n(t)$ and $s_j^n(t)$ were independent could not be rejected
($5\%$ level, using a test described by @Genest2007, implemented in the
function `BiCopIndTest` in the `VineCopula` package in R);
or C) independence was rejected but the Spearman correlation
of $s_i^n(t)$ and $s_j^n(t)$ was negative.<!--END CHANGES--> Otherwise we defined
$C^n(i,j)=\cor_{0,b}(s_i^n(t),s_j^n(t))-\cor_{1-b,1}(s_i^n(t),s_j^n(t))$,
where the partial Spearman correlations in this expression were computed
over the times, $t$, for which data were available for location $n$.
The entry $C^n(i,j)$ was set to NA if independence of
$s_i^n(t)$ and $s_j^n(t)$
could not be rejected because attempting to quantify
tail association (or anything else about association) for independent
variables is pointless. $C^n(i,j)$ was set to NA for negatively associated
$s_i^n(t)$ and $s_j^n(t)$ because negative association occurred for only
one pair of species in one location
in our data (plankton sampling location 18, species *C. furca* and
*C. macroceros*, see \nameref{Results}).
Tail association
for negatively associated variables should be studied, and this
topic is revisited in the \nameref{Discussion},
but negative associations were too rare in our data to study them.
The community tail association matrix $C^n$ is symmetric. The
value $b=1/3$ was used for plankton locations, whereas $b=1/2$
was used for aphid locations because aphid time series were shorter, and
larger $b$ reduces sampling variation for our statistics [@Ghosh_copula].
See Appendix \ref{SM-boundb} for more information on the choice of $b$.
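The construction of $C^n$ can be sketched in R as below. This is an illustration under stated assumptions rather than the code used for the analysis: the helper names are hypothetical, the data are simulated, ranks are normalized as rank divided by $T+1$, and the independence test is a stand-in based on Kendall's tau via `cor.test` from base R instead of `BiCopIndTest`.

```r
# Partial Spearman correlation (hypothetical helper; rank / (T + 1) is an assumption).
partial_spearman <- function(x, y, lb, ub) {
  n <- length(x)
  u <- rank(x) / (n + 1)
  v <- rank(y) / (n + 1)
  in_band <- (u + v > 2 * lb) & (u + v < 2 * ub)
  if (!any(in_band)) return(NA_real_)
  sum((u[in_band] - mean(u)) * (v[in_band] - mean(v))) /
    ((n - 1) * sqrt(var(u) * var(v)))
}

# Hypothetical helper: community tail association matrix for one location.
# s is a years-by-species matrix with at least two columns.
# Stand-in assumption: independence is tested with Kendall's tau (stats::cor.test),
# not with VineCopula::BiCopIndTest as described in the text.
community_tail_matrix <- function(s, b, alpha = 0.05) {
  n_sp <- ncol(s)
  C <- matrix(NA_real_, n_sp, n_sp,
              dimnames = list(colnames(s), colnames(s)))  # diagonal stays NA
  for (i in seq_len(n_sp - 1)) {
    for (j in (i + 1):n_sp) {
      ok <- complete.cases(s[, i], s[, j])  # years with data for both species
      x <- s[ok, i]
      y <- s[ok, j]
      p <- cor.test(x, y, method = "kendall", exact = FALSE)$p.value
      rho <- cor(x, y, method = "spearman")
      # Entry stays NA if independence is not rejected or association is negative.
      if (p < alpha && rho > 0) {
        C[i, j] <- C[j, i] <-
          partial_spearman(x, y, 0, b) - partial_spearman(x, y, 1 - b, 1)
      }
    }
  }
  C
}

set.seed(2)
common <- rnorm(40)  # simulated shared signal, for illustration only
s <- cbind(sp1 = common + 0.3 * rnorm(40),
           sp2 = common + 0.3 * rnorm(40),
           sp3 = rnorm(40))
C <- community_tail_matrix(s, b = 1 / 2)
```

Each pair is computed once and written to both `C[i, j]` and `C[j, i]`, which makes the symmetry of the matrix explicit.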
We also computed a matrix, $D^n$, called the *community-driver tail association matrix*,
which quantifies asymmetry of tail association between
aphid or plankton time series and their covariates.
Denote by $d_k^n(t)$ the value of the $k^{\text{th}}$ covariate
that operated at sampling location $n$ in year $t$ (winter temperature for an
aphid sampling location, sea surface temperature or *C. finmarchicus* density
for a *Ceratium* location). We then defined $D^n$ by defining
$D^n(i,k)$ for an aphid or *Ceratium* species $i$ and a covariate $k$,
as follows. <!--DAN MODIFIED THE FOLLOWING-->First, $D^n(i,k)$ was not defined,
or was set to NA, if
the hypothesis that $s_i^n(t)$ and $d_k^n(t)$ were independent could not
be rejected ($5\%$ level, `BiCopIndTest`). Otherwise, we either: A)
set $D^n(i,k) = \cor_{0,b}(s_i^n(t),d_k^n(t))-\cor_{1-b,1}(s_i^n(t),d_k^n(t))$
if $s_i^n(t)$ and $d_k^n(t)$ were positively associated (positive Spearman correlation);
or B) set $D^n(i,k) = \cor_{0,b}(s_i^n(t),-d_k^n(t))-\cor_{1-b,1}(s_i^n(t),-d_k^n(t))$
if $s_i^n(t)$ and $d_k^n(t)$ were negatively associated (negative Spearman
correlation).<!--END CHANGES-->
For aphid first-flight time series, for which $k$ was always $1$ and
$d_k^n(t)$ was winter temperature in location $n$, associations between
$s_i^n(t)$ and $d_k^n(t)$ were always negative when they were significant
(see \nameref{Results}). The same was true for *Ceratium* density time series and
sea surface temperature.
Thus our practice of using $-d_k^n(t)$ was
equivalent, in the case of temperature variables, to using a "coldness"
index such as the number of degrees colder than an average
or typical reference temperature, in place of temperature. Aphid and *Ceratium* data
were always positively associated with the coldness index when they were
significantly associated with it. Although *C. finmarchicus* abundance was
positively associated with *Ceratium* time series in some sampling locations
and negatively associated in others, it always showed the same sign of association
with all *Ceratium* species within a location.
Using $-d_k^n(t)$ in place
of $d_k^n(t)$ when negative associations with aphid or *Ceratium* data occurred
allowed us to study asymmetry of tail association using methods developed with
positively associated variables in mind.
<!--***DAN: Lawrence got confused by the below text. I figured it was not essential here so removed it, storing
here for the time being, just in case:
Note that using a similar procedure to
study pairs of negatively associated biological variables (e.g., two aphid or plankton
time series) would be inappropriate because in that case there is no
canonical choice of which variable to take the negative of.-->
We again used
$b=1/3$ for plankton data and covariates, and $b=1/2$ for aphid data and winter
temperature. For display, we horizontally concatenated the matrices
$C^n$ and $D^n$ and displayed matrix values using color.
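A single entry $D^n(i,k)$, including the reversal of negatively associated covariates, might be computed as in the following sketch. The helper names are hypothetical, the data are simulated, the rank normalization (rank divided by $T+1$) is an assumption, and a Kendall's tau test from base R stands in for `BiCopIndTest`.

```r
# Partial Spearman correlation (hypothetical helper; rank / (T + 1) is an assumption).
partial_spearman <- function(x, y, lb, ub) {
  n <- length(x)
  u <- rank(x) / (n + 1)
  v <- rank(y) / (n + 1)
  in_band <- (u + v > 2 * lb) & (u + v < 2 * ub)
  if (!any(in_band)) return(NA_real_)
  sum((u[in_band] - mean(u)) * (v[in_band] - mean(v))) /
    ((n - 1) * sqrt(var(u) * var(v)))
}

# Hypothetical helper: one entry of the community-driver tail association matrix.
# Stand-in assumption: Kendall's tau test instead of VineCopula::BiCopIndTest.
driver_tail_assoc <- function(s_i, d_k, b, alpha = 0.05) {
  ok <- complete.cases(s_i, d_k)
  s_i <- s_i[ok]
  d_k <- d_k[ok]
  p <- cor.test(s_i, d_k, method = "kendall", exact = FALSE)$p.value
  if (p >= alpha) return(list(value = NA_real_, reversed = NA))
  # Negatively associated covariates are replaced by their negatives.
  reversed <- cor(s_i, d_k, method = "spearman") < 0
  d_use <- if (reversed) -d_k else d_k
  value <- partial_spearman(s_i, d_use, 0, b) -
    partial_spearman(s_i, d_use, 1 - b, 1)
  list(value = value, reversed = reversed)
}

set.seed(3)
winter_temp <- rnorm(35)                         # simulated, for illustration only
first_flight <- -winter_temp + 0.4 * rnorm(35)   # later flights after colder winters
res <- driver_tail_assoc(first_flight, winter_temp, b = 1 / 2)
```

In this simulated example the association is negative, so `res$reversed` is `TRUE` and the value is computed against `-winter_temp`, which plays the role of a coldness index.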
We used the community tail association matrix $C^n$ for each sampling location $n$ to answer
Q1 from the \nameref{Introduction}, as follows. First, we counted the number,
$N_L^n$, of entries of $C^n$ which were not NA and which were greater than
$0$. These were the "left-tail dominant" species pairs, i.e., pairs of species
for which association was stronger in the left than in the right tails of the species distributions.
We also counted the number, $N_R^n$, of right-tail dominant pairs, for which
the corresponding entries of $C^n$ were negative. If $N_L^n$ was substantially
greater than (respectively, substantially less than) $N_R^n$ for a location $n$,
it suggested that left-tail association (respectively, right-tail association)
between species in that location was dominant, answering Q1 in the affirmative.
We also calculated $A_{C,L}^n$, the
sum of all positive, non-NA entries of $C^n$; $A_{C,R}^n$, the sum
of all negative, non-NA entries of $C^n$; and $A_C^n=A_{C,L}^n+A_{C,R}^n$, a
general measure of asymmetry of tail association in location $n$.
We refer to $A_C^n$ as the *total community tail association*.
We additionally calculated
the normalized quantities $F_{C,L}^n=A_{C,L}^n/(A_{C,L}^n+|A_{C,R}^n|)$
and $F_{C,R}^n=A_{C,R}^n/(A_{C,L}^n+|A_{C,R}^n|)$.
Because $0\leq F_{C,L}^n \leq 1$,
$0 \leq |F_{C,R}^n| \leq 1$, and $F_{C,L}^n+|F_{C,R}^n|=1$, the relative sizes of
$F_{C,L}^n$ and $|F_{C,R}^n|$ indicate the relative dominance of left- and right-tail
association between species at location $n$. Together, all these statistics provide an answer to Q1.
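The summary statistics for Q1 follow directly from the entries of $C^n$. The sketch below uses a hypothetical helper name and a toy matrix with made-up values (not data), and it counts all non-NA entries of the matrix, as the definitions above are worded.

```r
# Hypothetical helper: summary statistics of a community tail association matrix.
summarize_tail_matrix <- function(C) {
  vals <- C[!is.na(C)]
  A_L <- sum(vals[vals > 0])   # sum of positive entries (left-tail dominant)
  A_R <- sum(vals[vals < 0])   # sum of negative entries (right-tail dominant)
  denom <- A_L + abs(A_R)      # zero if there are no nonzero entries, giving NaN below
  list(N_L = sum(vals > 0), N_R = sum(vals < 0),
       A_L = A_L, A_R = A_R, A = A_L + A_R,
       F_L = A_L / denom, F_R = A_R / denom)
}

# Toy symmetric matrix for three hypothetical species (illustrative values only).
C <- matrix(NA_real_, 3, 3)
C[1, 2] <- C[2, 1] <- 0.10
C[1, 3] <- C[3, 1] <- -0.04
C[2, 3] <- C[3, 2] <- 0.02
stats <- summarize_tail_matrix(C)
```

Because the matrix is symmetric, each species pair contributes twice when all entries are used. Restricting to the upper triangle would halve the counts and sums but leave $F_{C,L}^n$ and $F_{C,R}^n$ unchanged.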
We used the community tail association matrix, $C^n$, and the community-driver
tail association matrix, $D^n$, to answer Q2 from the \nameref{Introduction}
for the *Ceratium* and aphid data, as follows. First, we calculated
$A_D^n$, the sum of all non-NA entries of $D^n$. This was analogous to
$A_C^n$, but calculated using the matrix $D^n$ instead of the matrix
$C^n$. We refer to $A_D^n$ as the *total community-driver tail association*.
We then examined whether the values $A_C^n$ and $A_D^n$ were
correlated across locations, $n$.
This tests the causal hypothesis in the \nameref{Introduction} because it
tests whether *Ceratium* or aphid time series having
stronger right-tail (respectively, left-tail) association
with environmental covariates in a given location also had
stronger right-tail (respectively, left-tail) association
with each other at that location.
Recall that an environmental covariate was reversed (its negative was used)
when it was negatively associated with a *Ceratium* or aphid species, and that
no covariate was ever significantly positively associated with some
*Ceratium* or aphid species and significantly negatively associated with another
such species in the same location (see \nameref{Results}).
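The cross-location comparison of $A_C^n$ and $A_D^n$ can be sketched as follows. The matrices are toy examples with made-up values for four hypothetical locations, and the helper name `total_tail_assoc` is an assumption of this sketch.

```r
# Hypothetical helper: sum of all non-NA entries of a tail association matrix.
total_tail_assoc <- function(m) sum(m, na.rm = TRUE)

# Toy matrices for four hypothetical locations (illustrative values only, not data).
# Each C is a 2 x 2 community matrix; each D has one column per covariate.
C_list <- list(matrix(c(NA, 0.10, 0.10, NA), 2, 2),
               matrix(c(NA, -0.06, -0.06, NA), 2, 2),
               matrix(c(NA, 0.02, 0.02, NA), 2, 2),
               matrix(c(NA, -0.03, -0.03, NA), 2, 2))
D_list <- list(matrix(c(0.08, 0.05), 2, 1),
               matrix(c(-0.04, NA), 2, 1),
               matrix(c(0.01, 0.03), 2, 1),
               matrix(c(-0.02, -0.01), 2, 1))

A_C <- vapply(C_list, total_tail_assoc, numeric(1))
A_D <- vapply(D_list, total_tail_assoc, numeric(1))

# Pearson correlation across locations, two-tailed test.
ct <- cor.test(A_C, A_D, method = "pearson", alternative = "two.sided")
```

The fields `ct$estimate` and `ct$p.value` then hold the correlation across locations and its two-tailed $p$-value.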
We also answered Q2 for the aphid data as follows. Within a location, $n$, for each species,
$i$, we computed the mean $\alpha^n_C(i)$ of all non-NA entries $C^n(i,j)$,
for $j$ ranging across all species for which we had data. This quantity measures an
average tail association of species $i$ with other species in the
same location, with positive values for greater left-tail association and
negative ones for greater right-tail association. We refer to
$\alpha^n_C(i)$ as the *species-community tail association* for species $i$.
We then defined
$\alpha^n_D(i)$ as the sum of all non-NA entries $D^n(i,k)$, for $k$
ranging across all covariates for which we had data. We refer to this
as the *species-driver tail association* for species $i$. For
aphids we only had one covariate, winter temperature, so $\alpha^n_D(i)=D^n(i,k)$
for $k=1$ corresponding to winter temperature. We provide the more general
definition of $\alpha^n_D(i)$ that applies when more
covariates were available so the definition can also be considered
(briefly, see below) for *Ceratium* data. We then examined, for each location, $n$, whether
$\alpha^n_C(i)$ and $\alpha^n_D(i)$ were correlated across species, $i$.
This tests the causal hypothesis in the \nameref{Introduction} because it
tests whether aphid species which were more right-tail (respectively, left-tail)
associated with environmental covariates (winter temperature) also had
time series that were more right-tail (respectively, left-tail) associated with
the time series of other species in the location. Recall that
winter temperature was always negatively associated with aphid first flight
when it was significantly associated (see \nameref{Results}),
and negative temperature (a coldness index) was used in computing $D^n(i,k)$.
Testing whether $\alpha^n_C(i)$ and $\alpha^n_D(i)$ were correlated across
species, $i$, within a location, $n$, was not practical for *Ceratium*,
because we only had data for at most four *Ceratium* species per sampling
location, an insufficient number to provide much statistical power
in testing for a correlation.
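For the species-level analysis, $\alpha^n_C(i)$ is a row mean of $C^n$ and $\alpha^n_D(i)$ is a row sum of $D^n$. The sketch below uses a hypothetical helper and toy values (not data), and it assumes a one-tailed Pearson test for a positive correlation.

```r
# Hypothetical helper: apply a row statistic, returning NA for rows with no data.
row_stat <- function(m, f) {
  out <- f(m, na.rm = TRUE)
  out[rowSums(!is.na(m)) == 0] <- NA_real_
  out
}

# Toy matrices for four hypothetical species at one location (illustrative only).
C <- matrix(c(NA,    0.10,  0.06,  NA,
              0.10,  NA,    0.02, -0.04,
              0.06,  0.02,  NA,   -0.02,
              NA,   -0.04, -0.02,  NA), 4, 4, byrow = TRUE)
D <- matrix(c(0.09, 0.03, NA, -0.05), 4, 1)

alpha_C <- row_stat(C, rowMeans)  # species-community tail association
alpha_D <- row_stat(D, rowSums)   # species-driver tail association

# Correlation across species; cor.test drops species with a missing value.
ct <- cor.test(alpha_C, alpha_D, method = "pearson", alternative = "greater")
```

With only one covariate, as for the aphid data, `alpha_D` is simply the single column of `D`.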
# Results\label{Results}
<!--Results for Q1-->
<!--For Ceratium-->
Associations between *Ceratium* species were always positive when they were significant,
except for one pair of species in one location (plankton sampling location 18, species *C. furca* and
*C. macroceros*).
Asymmetric tail association was very common between *Ceratium* population density
time series from the same location, answering Q1 in the affirmative for *Ceratium*;
for some locations, left-tail association between *Ceratium*
species was dominant, and for other locations right-tail association was dominant.
Specifically, for some locations, the community tail association
matrix, $C^n$, consisted largely of positive values,
indicating a preponderance of left-tail association between *Ceratium* time
series for the location (Fig. \ref{fig_CorlmCoru_plankton_map_loc12_26}A).
For such locations, *Ceratium* population densities are more likely to be correlated
across species at low population densities than at high densities.
For other locations, $C^n$ had mostly negative values, indicating a
preponderance of right-tail association (Fig. \ref{fig_CorlmCoru_plankton_map_loc12_26}B).
For such locations, *Ceratium* population densities are more likely to be correlated
across species at high population densities than at low densities.
As a second demonstration of the same result, the
statistics $F_{C,L}$ and $F_{C,R}$, plotted across all sampling locations
(Fig. \ref{fig_CorlmCoru_plankton_map_loc12_26}C), indicated that most *Ceratium*
sampling locations were dominated by either left- or right-tail association, in
approximately equal numbers; only a few locations had
more symmetric tail association, on average across pairs of *Ceratium* species.
<!--For aphids-->
Associations between aphid time series were always positive
when they were significant.
Asymmetric tail association was also very common between aphid first flight
time series from the same location, answering Q1 in the affirmative for aphids;
left-tail association was more common for some sampling locations
and right-tail association dominated for others, but for most sites right-tail
association dominated.
Specifically, for some locations, the community
tail association matrix, $C^n$,
consisted of a slight majority of positive values,
indicating more left- than right-tail association between aphid time
series for the location (Fig. \ref{fig_CorlmCoru_ff_map_loc2_5}A);
whereas for other locations, $C^n$ had mostly negative values, indicating a
preponderance of right-tail association (Fig. \ref{fig_CorlmCoru_ff_map_loc2_5}B).
As a second demonstration of the same result, the statistics
$F_{C,L}$ and $F_{C,R}$, plotted across all sampling locations
(Fig. \ref{fig_CorlmCoru_ff_map_loc2_5}C), indicated that most aphid sampling
locations had a preponderance of right-tail association, with only a few locations
having more left-tail association, and those only slightly more. Thus, for most locations,
aphid first flights are more correlated across species when first flights are
later than average.
<!--\pagebreak
\begin{centering}
\textbf{ \hspace{1 cm} (A) \hspace{7 cm} (B)} \\
\includegraphics[width=7.5 cm]{./Results/plankton_results/npa_stat_results/loc12/loc12_Corl-Coru.pdf}
\includegraphics[width=7.5 cm]{./Results/plankton_results/npa_stat_results/loc26/loc26_Corl-Coru.pdf}\\
\textbf{ (C)} \\
\includegraphics[width=6.5 cm]{./Results/plankton_results/npa_stat_results/Corstat_LmU_values_on_map_sp_only.pdf}
\captionsetup{parbox=none}
\captionof{figure}[short caption]{Either right- or left-tail association between population density time series of
\emph{Ceratium} species could dominate, depending on the sampling location. (A, B) The community
tail association matrix, $C^n$, and the community-driver tail association matrix, $D^n$ (\nameref{Methods}),
horizontally concatenated, for example locations $n=12$ (A) and $n=26$ (B).
See Table \ref{tab_plankton_aphid_info} for species names.
All the non-NA values in $C^n$ were positive (red) for location $12$ (A), indicating left-tail
association dominated in that location; but values were largely negative (blue) for location
$26$ (B), indicating right-tail association dominated there. Matrix entries which were NA
because time series were independent are displayed in yellow. The counts $N^n_L$ and $N^n_R$
(see \nameref{Methods}) also reflect the distinct tail association characteristics of the two
locations. \emph{C. fin.} = \emph{C. finmarchicus}; Temp. = temperature.
Green dots in $D^n$
represent variables which were originally negatively associated, so the negative
of the environmental covariate was used for calculating tail association.
See Fig. \ref{SM-fig_CorlmCoru_plankton_all_loc}
for analogous figures for the other sampling locations.
(C) The summary statistics $F_{C,L}$ and $F_{C,R}$ (see \nameref{Methods})
for each site show that association between \emph{Ceratium} species was either
substantially dominated by the left or right tails of \emph{Ceratium} distributions, with the
exceptions of a few locations for which tail association was closer to symmetric. Site codes
are colored red or blue depending on which of $F_{C,L}$ or $F_{C,R}$ had higher magnitude. Values are
not plotted for site 3 because, for that site, the hypothesis that dynamics of distinct
\emph{Ceratium} species were independent could not be rejected.
\label{fig_CorlmCoru_plankton_map_loc12_26}}
\end{centering}-->
<!--
\begin{center}
\rule{0.3\textwidth}{\textheight}
\captionof{figure}{text on the next page}
\end{center}-->
<!--Results for Q2-->
<!--Para for Ceratium, using A_C^n and A_D^n across n.-->
For the *Ceratium* data, the total community tail association, $A_C^n$, and the
total community-driver tail association, $A_D^n$, were significantly correlated
across locations, $n$, validating our
hypothesis from the \nameref{Introduction} for a cause of tail association between co-located
species, and helping to answer Q2. In other words, tail association between co-located
species time series was apparently inherited from common tail association of the species
with environmental drivers.
Across our `r length(res_cer$CorlmCoru_all_ln_list)` locations,
$A_C^n$ and $A_D^n$ were significantly positively correlated (Pearson correlation,
two-tailed test, Fig. \ref{fig_plankton_scatter}A). Thus locations for which
*Ceratium* density time series showed greater left-tail (respectively, right-tail)
association with environmental covariates (measured with $A_D^n$)
also exhibited greater left-tail (respectively, right-tail) association between
density time series for distinct species (measured with $A_C^n$).
<!--Para for aphids, using A_C^n and A_D^n across n.-->
For the aphid data, the total community tail association, $A_C^n$, and the total community-driver
tail association, $A_D^n$, were positively but non-significantly correlated
across our `r length(res_ff$CorlmCoru_all_ln_list)` sampling locations
(Fig. \ref{fig_plankton_scatter}B). Thus locations for which aphid first-flight time series
showed greater left-tail (respectively, right-tail) association with winter temperature
also showed a non-significant tendency toward greater left-tail (respectively, right-tail) association between
the time series of distinct species.
<!--***DAN CHANGED: The correlation may have been non-significant for the aphid data
-simply because there were slightly fewer aphid sampling locations than there were
-plankton locations.
-Nevertheless, when combined with the plankton result, this aphid result tends to support
-the hypothesis that tail association between co-located species time series can be inherited from
-common tail association on environmental drivers. See also the subsequent results for aphids.-->
The correlation was close to significant for the aphid data, and
may have been non-significant simply because there were slightly fewer aphid sampling locations than there were
plankton locations. See also the subsequent results for aphids, which were significant and which support
the same overall conclusions.
<!--END OF NEW TEXT-->
<!--\pagebreak
\begin{centering}
\textbf{ (A) \hspace{7 cm} (B)} \\
\includegraphics[width=7.5 cm]{./Results/aphid_results/ff_npa_stat_results/loc2/loc2_Corl-Coru.pdf}
\includegraphics[width=7.5 cm]{./Results/aphid_results/ff_npa_stat_results/loc5/loc5_Corl-Coru.pdf} \\
\textbf{ (C)} \\
\includegraphics[width=7.5 cm]{./Results/aphid_results/ff_npa_stat_results/Corstat_LmU_values_on_map_sp_only.pdf}
\captionsetup{parbox=none}
\captionof{figure}[short caption]{Either right-tail association between first-flight time series of
aphid species could dominate, or left-tail association could be more common,
depending on the sampling location. (A, B) The community tail association matrix, $C^n$,
and the community-driver tail association matrix, $D^n$ (\nameref{Methods}), horizontally
concatenated, for example locations $n=2$ (A) and $n=5$ (B).
See Table \ref{tab_plankton_aphid_info} for species names.
A slight majority of non-NA values in $C^n$ were positive (red) for location $2$ (A; see the
$N_L^n$ and $N_R^n$ counts displayed), indicating left-tail
association was slightly more common than right-tail association in that location. But values were largely
negative (blue) for location
$5$ (B), indicating right-tail association dominated there. Matrix entries which were NA
because time series were independent are displayed in yellow. Temp. = temperature.
Green dots in $D^n$
represent variables which were originally negatively associated, so the negative
of winter temperature was used for calculating tail association (\nameref{Methods}); this happened in all cases
for which temperature and first flight were significantly associated.
See Fig. \ref{SM-fig_CorlmCoru_ff_all_loc}
for analogous figures for the other sampling locations.
(C) The summary statistics $F_{C,L}$ and $F_{C,R}$ (see \nameref{Methods})
for each site show that association was either
dominated by the right tails, or, for a few locations,
showed slightly more left-tail association. Site codes
are colored red or blue depending on which of $F_{C,L}$ or $F_{C,R}$ had higher magnitude.}
\label{fig_CorlmCoru_ff_map_loc2_5}
\end{centering}-->
<!--Para for aphids using the second method.-->
Our second analysis using aphids, based on the species-community tail associations, $\alpha_C^n(i)$,
and the species-driver tail associations, $\alpha_D^n(i)$ (\nameref{Methods}),
provided further evidence supporting our hypothesis for a cause of tail association between co-located
species (\nameref{Introduction}). For 8 of 10 sampling locations, $\alpha_C^n(i)$ and $\alpha_D^n(i)$
were significantly correlated across species, $i$ (Fig. \ref{fig_aphid_multipanel}).
In other words, for 8 of 10 locations, aphid species with greater left-tail (respectively,
right-tail) association with winter temperature also had greater left-tail (respectively,
right-tail) association with other aphid species.
# Discussion\label{Discussion}
<!--P: summary of answers to Qs-->
Our results show that synchronous population density or phenological time series
of co-located species can very commonly show asymmetric tail association. For some sampling locations
and species, tail association was predominantly in the left tails, and for others it was predominantly
in the right tails of time series distributions, showing a new kind of ecologically
meaningful variation among ecosystems. The partial Spearman correlation presented
by @Ghosh_copula is a simple and effective way to measure tail association for ecological applications.
Our results also demonstrate a mechanism by which asymmetric tail association between species
can arise: it can
be inherited from joint tail association of the two species with the same environmental variables.
This mechanism seems likely to apply commonly when co-located species are influenced by the
same external factors. Our results convincingly show that standard correlation approaches
omit phenomena that seem likely to be important for at least two major topics of interest in ecology:
synchronous/compensatory dynamics of species within a community and their influence on
community stability; and shifting phenologies and the match-mismatch hypothesis.
<!--P: Interpretations and consequences of patterns for Ceratium-->
The distinct tail association characteristics of *Ceratium* in different sampling areas
around the UK may have consequences for the stability through time of total *Ceratium* abundance,
which may relate to harmful algal blooms because *Ceratium* species can have a role in such blooms [@Baek2009].
For locations in which left-tail association between *Ceratium* density time series
is dominant, *Ceratium* species tend to be scarce simultaneously,
potentially producing years of very low total *Ceratium* biomass.
In contrast, for locations in which right-tail association is dominant, *Ceratium*
species tend to be highly abundant simultaneously, which may produce
years of very high *Ceratium* biomass, some of which may correspond to harmful algal blooms.
Our results show that the distinction between these two types of location relates to the
tail association of *Ceratium* species with their environmental covariates,
sea surface temperature and *C. finmarchicus* density. It may be useful to study
in future work why some locations principally have left-tail association with these
drivers and some principally have right-tail association.
<!--P: same for aphids-->
First-flight time series for populations of co-located aphid species were principally
right-tail associated, i.e., more strongly correlated when first flights were later in the
season. Our results show this was probably because cold winters delay aphid first flights,
whereas warm winters do not lead to first flights that are any earlier, on average, than those following moderate winters.
This produces right-tail association between first flights and winter coldness across multiple species,
and this common association leads to right-tail association between aphids. Thus winter temperature
fluctuations lead to temporally dispersed early but temporally coordinated late arrival times of aphid
species on summer hosts (many of which are crops, for the species we studied), a fact that may have pest-control
significance. Winter temperature is known to
influence the first-flight dates of virtually all the aphid species for which we had data
[@sheppard2016]. Overwintering aphids are sensitive to frost conditions, so cold winters
probably reduce early spring populations on winter host plants. This then lengthens the time
required for populations to reach sufficient densities to stimulate the production of winged morphs
for flight to summer host plants.
<!--P: 4) consequences of tail dependence in spatial synchrony as studied in BIVAN (skewness stuff), links to similar idea for community dynamics, and that is a reason what we have done is important-->
If $x_{s,l}(t)$ denotes the population density of species $s$ ($s=1,\ldots,S$) in location $l$ ($l=1,\ldots,L$)
at time $t$, we have here studied the nature and causes of tail association among the time
series $x_{s,l}(t)$ for a fixed $l$ and for $s=1,\ldots,S$; whereas @Ghosh_copula studied
the nature, causes and consequences of tail association among the time series
$x_{s,l}(t)$ for fixed $s$ and $l=1,\ldots,L$, a distinct ecological context. One
of the consequences studied by @Ghosh_copula
relates to and illuminates a potential consequence, mentioned above, of tail association for the ecological
context of this study. @Ghosh_copula showed that the skewness, through time, of the
spatial-total time series $\sum_l x_{s,l}(t)$ is sensitive to the nature of tail association between
the $x_{s,l}(t)$ ($l=1,\ldots,L$), if these time series are positively associated with each other. Right-tail
(respectively, left-tail) association tended to produce right (respectively, left) skew in the total.
Right skew corresponds to a spatial-total time series with exceptionally large values,
i.e., to "spiky", unstable dynamics of the total population. Left skew
corresponds to a spatial-total time series with exceptionally low values, i.e., to dynamics of the total
population with a tendency to "crash". The total population can
be regarded as a landscape-level measure of the stability or variability of species
$s$, and is important, for instance, if species $s$ is a pest or an exploited species.
For the same reasons, the skewness, through time, of the community-total time series
$\sum_s x_{s,l}(t)$ is sensitive to the tail association between the $x_{s,l}(t)$ ($s=1,\ldots,S$),
which we have here studied. Right-tail
(respectively, left-tail) association again tends to produce right (respectively, left) skew in the total
time series.
In this community context, the total is an aggregate property of the community, and the variability of
this total has been used in an extensive literature [e.g., @hallett2014] to characterize community stability
through time. This literature has explored the effects of synchronous versus compensatory
dynamics in the $x_{s,l}(t)$ ($s=1,\ldots,S$) on the stability of the total community time series,
$\sum_s x_{s,l}(t)$. But our results show that, even if all the species time series
$x_{s,l}(t)$ ($s=1,\ldots,S$) are synchronous with each other, the tail association properties
of these time series can influence the stability of the community-total time series.
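The link between tail association and skewness of a total can be illustrated with a small simulation. The sketch below is not an analysis from this study: it assumes two variables with standard normal marginals joined by a Clayton copula (which has left-tail dependence) or by its reflection (right-tail dependence), with the arbitrary parameter value $\theta=4$.

```r
set.seed(42)
n <- 20000
theta <- 4  # Clayton copula parameter (arbitrary illustrative value)

# Sample from a Clayton copula by conditional inversion.
u1 <- runif(n)
w <- runif(n)
u2 <- (u1^(-theta) * (w^(-theta / (1 + theta)) - 1) + 1)^(-1 / theta)

# Sample skewness (third standardized moment).
skewness <- function(z) mean((z - mean(z))^3) / sd(z)^3

# Standard normal marginals; the reflected pair has right-tail dependence.
left_total <- qnorm(u1) + qnorm(u2)            # left-tail associated components
right_total <- qnorm(1 - u1) + qnorm(1 - u2)   # right-tail associated components

skew_left <- skewness(left_total)
skew_right <- skewness(right_total)
```

In this simulation the total of the left-tail associated pair is left skewed and the total of the right-tail associated pair is right skewed, although each component is symmetric and both pairs have the same Spearman correlation.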
Although our results are sufficient to show that tail associations are
likely to be important for studies of community dynamics and stability, many
communities show not only synchronous dynamics between some species pairs
$x_{s_i,l}(t)$ and $x_{s_j,l}(t)$, but also compensatory dynamics between other pairs.
Our *Ceratium* time series were almost entirely synchronous, so we could not study
the importance of tail association for compensatory dynamics.
Next research steps should include the study of tail association between compensatory species
within a local community. Furthermore, *Ceratium* are only part of the
phytoplankton community in UK seas. It may be advantageous for future work to use
data characterizing an entire competitive community. For instance, the data of @hallett2014
constitute annual abundances of all species of plant in an area. In that dataset,
some species pairs show synchronous and some show compensatory dynamics.
<!--P: 7) para on how to deal with negatively associated variables in future work-->
Studying asymmetry of tail association for negatively correlated species density time
series will require slightly modified methods. The only negative association between aphid or *Ceratium* time series
that occurred in our system was not analyzed. Negative associations between species time series and
the environmental covariates we considered were handled
statistically by considering the positive association between the species time
series and a "reversed" covariate; this corresponds to a positive
association with a reconceptualized covariate, e.g., a "coldness" index.
But that approach would make no sense for negatively
associated time series of two aphid or *Ceratium* species:
there is no canonical choice of which variable to reverse. Asymmetry of tail
association could still be considered, however, for negatively associated variables, $u,v$,
in an unsigned approach, via the index $|\cor_{0,b}(u,1-v)-\cor_{1-b,1}(u,1-v)|$. Because $|\cor_{0,b}(u,1-v)-\cor_{1-b,1}(u,1-v)|=|\cor_{0,b}(1-u,v)-\cor_{1-b,1}(1-u,v)|$,
no choice need be made on which variable to "reverse."
A large value of this index indicates that tail association between $u$ and $v$ is asymmetric,
though it does not provide information on whether association is stronger between
the left tail of $u$ and the right tail of $v$ or between the right tail of $u$
and the left tail of $v$.
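The equality of the two forms of the unsigned index can be checked numerically. The sketch below uses simulated, negatively associated data and hypothetical helper names, and it assumes ranks normalized as rank divided by $T+1$, so that $1-u$ and $1-v$ are themselves the normalized ranks of the reversed variables.

```r
# Partial Spearman correlation computed directly from normalized ranks u, v.
partial_cor_ranks <- function(u, v, lb, ub) {
  in_band <- (u + v > 2 * lb) & (u + v < 2 * ub)
  if (!any(in_band)) return(NA_real_)
  sum((u[in_band] - mean(u)) * (v[in_band] - mean(v))) /
    ((length(u) - 1) * sqrt(var(u) * var(v)))
}

# Unsigned asymmetry index for negatively associated variables (reversing v).
unsigned_asymmetry <- function(u, v, b) {
  abs(partial_cor_ranks(u, 1 - v, 0, b) - partial_cor_ranks(u, 1 - v, 1 - b, 1))
}

set.seed(4)
x <- rnorm(49)            # simulated data, for illustration only
y <- -x + rnorm(49)       # negatively associated with x
u <- rank(x) / 50         # assumption: rank / (T + 1) with T = 49
v <- rank(y) / 50
b <- 1 / 3

index_reverse_v <- unsigned_asymmetry(u, v, b)
# Same index computed by reversing u instead of v.
index_reverse_u <- abs(partial_cor_ranks(1 - u, v, 0, b) -
                         partial_cor_ranks(1 - u, v, 1 - b, 1))
```

The two values agree because reversing $u$ instead of $v$ swaps the roles of the two bands while leaving each product of deviations unchanged, so the difference changes sign only.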
<!--\begin{centering}
\textbf{ \hspace{1.2 cm} (A) \hspace{7 cm} (B)} \\
\includegraphics[width=8 cm]{./Results/plankton_results/npa_stat_results/Corstat_scatter_LmU_values.pdf}
\includegraphics[width=8 cm]{./Results/aphid_results/ff_npa_stat_results/Corstat_scatter_LmU_values.pdf}
\captionsetup{parbox=none}
\captionof{figure}[short caption]{Tail association with environmental covariates was positively related to tail association
between species for aphid and plankton time series. Panels show total community tail association, $A_C^n$,
plotted against total community-driver tail association, $A_D^n$ (\nameref{Methods}), across
locations, $n$, for \emph{Ceratium} density (A) and aphid first-flight (B) data. Pearson correlations
and associated $p$-values for each panel are in the headers. Points are labeled with location
numbers (see Figs \ref{SM-fig_plankton_map} and \ref{SM-fig_aphid_map}).\label{fig_plankton_scatter}}
\end{centering}-->
<!--what extra info do you get by monitoring tail dep. for plankton?-->
Measures of tail association may also reveal useful information about
freshwater plankton ecosystems and harmful algal blooms, in addition to information about
marine harmful algal blooms (discussed above).
Because blooms are extreme phenomena involving multiple species,
monitoring the associations of phytoplankton species with each other and
their associations with
temperature and nutrient data in the
extremes (this is tail association)
could help us to better understand harmful blooms. Considering tail association
may even produce improvements in statistics that have been
developed to serve as early warning signals of impending major changes<!--DAN CHANGED:-->
(so-called "tipping points")<!--END CHANGES--> in plankton
communities and the lakes they inhabit [@carpenter2011early; @butitta2017spatial],
since some established early warning statistics
make use of skewness of population distributions [@guttal2008changing]. Tail association between
phytoplankton species is related to skewness of the total phytoplankton biomass
time series,<!--DAN CHANGED: as described above--> as described in an earlier Discussion
paragraph.<!--END CHANGES-->
<!--\begin{centering}
\hspace{1 cm}
\includegraphics[width=16 cm]{./Results/aphid_results/ff_npa_stat_results/Corstat_LmU_avg_sp_temp_multipanel_plot_legend.pdf}
\includegraphics[width=16 cm]{./Results/aphid_results/ff_npa_stat_results/Corstat_LmU_avg_sp_temp_multipanel_plot.pdf}
\captionsetup{parbox=none}
\captionof{figure}[short caption]{For 8 out of 10 sites, the Pearson correlation ($P$) between the species-community tail association,
$\alpha_C^n(i)$, and the species-driver tail association, $\alpha_D^n(i)$, across
$i = 1, 2, \dots, 20$, was significantly positive (p $< 0.05$, one tailed test). This supports the
hypothesis that tail association between species may be inherited from joint tail
association of both species on a common environmental driver. See Table \ref{tab_plankton_aphid_info}
for species IDs.\label{fig_aphid_multipanel}}
\end{centering}-->
<!--P: 6) although our aphid results were sufficient to demo that tail dependence can be an important factor in the phenology of co-located species, and therefore *may* be an important factor for understanding shifting phenologies and their consequences, fuller application to match-mismatch will require future work using species which interact.
a) our aphid species have largely different hosts, so don't really interact
b) think about further commentary
c) When you present items 5 and 6 here, you will have to cause the reader to think back to your statement in the Intro that our results are just a first step toward the goals which were outlined there (i.e., copulas in community dynamics and phenology studies)-->
Although our aphid results were sufficient to demonstrate that tail association can be an important
factor in the phenology of co-located species, future work will need to apply
tail association ideas to other datasets to assess whether they
can improve our understanding of the consequences of changing phenology for trophic phenological matching.
The aphid species we studied have largely different host plants, so they do not directly interact, and shifts and
fluctuations in the phenology of one species probably do not directly influence
other species in our dataset. Future research should apply tail association to
time series of phenologies of interacting species, such as the data on tree budburst dates,