From 9297d5bb9e35ebd2754ff7b26be61840d4b8eead Mon Sep 17 00:00:00 2001
From: Tanya Strydom <tanya.strydom@icloud.com>
Date: Fri, 16 Feb 2024 13:05:55 +0000
Subject: [PATCH] =?UTF-8?q?=F0=9F=8F=96=EF=B8=8F=20sandbox=20commit?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 _freeze/index/execute-results/html.json |  4 ++--
 index.qmd                               | 26 ++++++++++++-------------
 2 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/_freeze/index/execute-results/html.json b/_freeze/index/execute-results/html.json
index 183dc4d..df3cc75 100644
--- a/_freeze/index/execute-results/html.json
+++ b/_freeze/index/execute-results/html.json
@@ -1,8 +1,8 @@
 {
-  "hash": "cc45f1da55ee72fe6a280911e01f2249",
+  "hash": "9733c2e74e650a00d9453f7dd3cd87dc",
   "result": {
     "engine": "jupyter",
-    "markdown": "---\ntitle: T is for Topology\nauthor:\n  - name: Tanya Strydom\n    id: ts\n    orcid: 0000-0001-6067-1349\n    corresponding: true\n    email: t.strydom@sheffield.ac.uk\n    roles:\n      - Words\n      - Nonsense\n      - Rascality\n      - Visualisation (although absent)\n    affiliation:\n      - id: sheffield\n        name: University of Sheffield\n  - name: Andrew P. Beckerman\n    id: apb\n    orcid: 0000-0002-7859-8394\n    corresponding: false\n    roles: []\n    affiliations:\n      - ref: sheffield\nfunding: 'The author(s) received no specific funding for this work. Well they did I just haven''t done the '\nkeywords:\n  - food web\n  - network construction\nabstract: |\n  Pending...\nplain-language-summary: |\n  We want to know a bit more about the different network topology generators (predict tools) and how they differ - *i.e.,*  their strengths and weaknesses\ndate: last-modified\nbibliography: references.bib\ncitation:\n  container-title: Some fancy journal\nnumber-sections: true\n---\n\n:::{#f93d1691 .cell .markdown}\n## Introduction\n\nThe standard run of the mill that we cannot always feasibly construct networks because 1. hard, 2. time (yay dinosaurs, but also the future and impending doom I guess), and 3. probably something else meaningful that's just slipping my mind at the moment. Some of the usual culprits will come in here like: @jordanoSamplingNetworksEcological2016; [@jordanoChasingEcologicalInteractions2016]; @poisotGlobalKnowledgeGaps2021\n\n> TODO: standardise language between a topology generator and a topology predictor... I guess they can both be considered models but I'm not quite sure...\n\n### Philosophical contemplation when constructing interaction networks\n\n#### Why do we want to construct an interaction network?\n\nArguably the need for methods and tools for constructing interaction networks arises from two different (but still aligned) places of interest within the field of network ecology. 
On the one side of the spectrum sits the researcher who is interested in generating a set of ecologically plausible but not necessarily realised 'in the field' for the purpose of running further simulations (*e.g.,* extinction sim REF? **TODO**) or understanding some higher-level process (*e.g.,* energetics REF **TODO**). This researcher is contrasted by one that is interested in constructing real-world, location specific interactions networks in lieu of having access to data generated in the field (see @strydomRoadmapPredictingSpecies2021 for a discussion on this). Of course these two categories are not distinct, mutually exclusive, groups but can rather be viewed as operating on a gradient ranging from a need for generality (*something*) to a need for specificity (local-level predictions) when it comes to the quality(?) of the network that is constructed by a specific tool. Of course this research need would also reflected in the model development process itself and thus the idea of what a 'good enough' constructed network will be in the context of assessing the performance of a specific model. \n\n@cohenStochasticTheoryCommunity1985 states that *\"[Their] approach is more like gross anatomy than like physiology... that is, the gross anatomy is frozen, rather than in motion.\"*. \n\nInterestingly @williamsSuccessItsLimits2008 also explicitly talk about *structural* food-web models in their introduction... so how I see it that means that there has always been this inherent acknowledgement that models are function at a specific 'network level'.\n\n#### The history behind the approach\n\n\nMaybe a brief history of the development of predictive tools/topo generators? Sort of where the theory/body of work was based and how that has changed? 
IS there a difference between topo generator and predictive tool - I'm inclined to think that it aligns with the whole debate of high level structure vs node-level perfection\n\nMaybe start here with discussing the core mechanistic differences that models will work at --- some are really concerned about (and thus constrained by) structure, others are more mechanistic in nature *i.e.,* species *a* has the capacity to eat species *b* because traits (read gob size), and then you get @rohrModelingFoodWebs2010 and @strydomFoodWebReconstruction2022 that sit in the weird liminal latent space... \n\nHere I will probably get on my (newly discovered) soapbox and wax lyrical about how in certain situations structure is enough (and that will probably be for some high-level things like thinking about energy flows etc., I can also see a world in which maybe you want to do some sort of robustness/extinction work - since then you're usually doing 'random' (within limits) extinctions) but there may be use cases where we are really interested in the node-level interactions *i.e.,* species identity is like a thing we need to care about and also be able to retrieve specific interactions at specific nodes correctly. What is the purpose of generating a network? Is it an element of a bigger question we are asking, *e.g.,* I want to generate a series of networks to do some extinction simulations/bioenergetic stuff OR are we looking for a 'final product' network that is relevant to a specific location? (this can still be broad in geographic scope). \n\nAt some point we are going to need to discuss the key differences and implications between predicting a metaweb (*sensu* @dunneNetworkStructureFood2006) and a network realisation. 
And here I can't help but think about @poisotSpeciesWhyEcological2015 (and probably other papers) that discuss how the local factors are going to play a role and even the same pair of species may interact differently in different points in the landscape.\n\n> Do we need to delve into individual-based networks? (*sensu* Tinker 2012, Araújo 2008) I think its probably a step too far and one starts creeping into apples and pears type of comparisons. Especially since these work off of already existing networks (I seem to recall) and its more about about 'tweaking' those - so not so much *de novo* predictions. Although this might be useful to keep in mind when it comes to re-wiring... Also on that note do we opn the re-wiring door here in this ms or wait it out a bit.\n\n## Data & Methods {#sec-data-methods}\n\n### Overview of models\n\n#### Structural models\n\n**Random model** [@erdosRandomGraphs1959]: Links are assembled randomly, not developed within an ecological framework. But of course could still hold if we assume that communities are randomly assembled in terms of who is interacting with who (I seem to think that's sort of what May was arguing but I would need to remind myself)\n\n**Cascade model** [@cohenCommunityFoodWebs1990]: Much like the name suggests the cascade model rests on the idea that species feed on one another in a hierarchical manner. This rests on the assumption that the links within a network are variably distributed across the network; with the proportion of links decreasing as one moves up the trophic levels (*i.e.,* 'many' prey and 'few' predators). This is achieved by assigning all species a random rank, this rank will then determine both the predators and prey of that species. A species will have a particular probability of being fed on by any species with a higher ranking than it, this probability is constrained by the specified connectance of the network. 
Interestingly here 'species' are treated as any individual that consume and are consumed by the same 'species', *i.e.,* these are not taxonomical species [@cohenStochasticTheoryCommunity1985]. The original cascade model has altered to be more 'generalised' [@stoufferQuantitativePatternsStructure2005], which altered the probability distribution of the prey that could be consumed by a species. \n\n**Niche model** [@williamsSimpleRulesYield2000]: The niche model introduces the idea that species interactions are based on the 'feeding niche' of a species. Broadly, all species are randomly assigned a 'feeding niche' range and all species that fall in this range can be consumed by that species (thereby allowing for cannibalism). The niche of each species is randomly assigned and the range of each species' niche is (in part) constrained by the specified connectance of the network. The niche model has also been modified, although it appears that adding to the 'complexity' of the niche model does not improve on its ability to generate a more ecologically 'correct' network [@williamsSuccessItsLimits2008]. \n\n**Nested hierarchy model** [@cattinPhylogeneticConstraintsAdaptation2004]:\n\n#### Mechanistic models\n\n**Allometric diet breadth model (ADBM)** [@petcheySizeForagingFood2008]:\n\n**Log-ratio** [@rohrModelingFoodWebs2010]: Interestingly often used in paleo settings (at least that's what it currently looks like in my mind... (*e.g.,* @yeakelCollapseEcologicalNetwork2014, @piresMegafaunalExtinctionsHuman2020)\n\n**Stochastic** [@rossbergFoodWebsExperts2006]:\n\n**PFIM** [@shawFrameworkReconstructingAncient2024]:\n\n**Trait-based** [@caronAddressingEltonianShortfall2022]:\n\n**Graph embedding** [@strydomGraphEmbeddingTransfer2023]: *e.g.,* [@strydomFoodWebReconstruction2022]\n\nI know tables are awful but in this case they may make more sense. 
Also I don't think I'm at the point where I can say that the table is complete/comprehensive but it getting there Not sure about putting in some papers that have used the model - totes happy to drop those I think...\n\n| Model               | Core Mechanism | End product | Specificity      | Interaction |\n|---------------------|----------------|-------------|------------------|-------------|\n| random              | structural     | network     | species agnostic | binary      |\n| cascade             | structural     | network     | species agnostic | binary      |\n| niche               | structural     | network     | species agnostic | binary      |\n| nested hierarchical | structural     | network     | species agnostic | binary      |\n| ADBM                | mechanistic    |             | energetics       | quantitative|\n| log-ratio           | pondering...   |             |                  |             |\n| PFIM                | mechanistic    | metaweb     | trait based      |pondering... |\n| graph embedding     | embedding      | metaweb     | evolutionary     |probabilistic|\n| trait model         | mechanistic    | metaweb     | trait based      |             |\n| stochastic          |                |             |                  |             |\n\n: Lets make a table that gives an overview of the different topology generators that we will look at {#tbl-history}\n\n> Might be nice to have a little appendix/supp mat that breaks down the models in detail so that they are all in one place so that someone (grad student being told they need to build networks) some day can go and educate themselves with slightly lower effort. 
This will also be useful for me should I end up having to do some actual coding - think of this as step one in the pseudo code process.\n\n### Datasets used\n\nHere I think we need to span a variety of domains, at minimum aquatic and terrestrial but maybe there should be a 'scale' element as well *i.e.,* a regional and local network. I think there is going to be a 'turning point' where structural will take over from mechanistic in terms of performance. More specifically at local scales bioenergetic constraints (and co-occurrence) may play a bigger role in structuring a network whereas at the metaweb level then mechanistic may make more (since by default its about who can potentially interact and obviously not constrained by real-world scenarios) *sensu* @caronTrophicInteractionModels2023. Although having said that I feel that contradicts the idea of backbones (*sensu* Bramon Mora (sp?) et al & Stouffer et al) But that might be where we get the idea of core *structure* vs something like linkage density. So core things like trophic level/chain length will be conserved but connectance might not (I think I understand what I'm trying to say here)\n\nI think we should also use the Dunne (I think) Cambrian (also think) network (I was correct and its this one @dunneCompilationNetworkAnalyses2008). Because 1) it gives the paleo-centric methods their moment in the sun and 2) I think it also brings up the interesting question of can we use modern structure to predict past ones? Here one might expect a more mechanistic approach to shine.\n\nDraw the other datasets from `Mangal` because they will be nicely formatted and essentially at point and shoot level\n\n### Comparing different models\n\nFor now the (still essentially pending) workflow/associated code can be found at the following repository [BecksLab/topology_generators](https://github.com/BecksLab/topology_generators)\n\n1. Shortlist/finalise the different topo generators\n2. 
collate/translate into `Julia`\n    * *e.g.,* some models wil be in SpeciesInteractionNetworks.jl (new EcoNet); I know (parts of) the transfer learning stuff is and the niche model\n    * others will need to be coded out (the more simpler models should be easier)\n    * can also consider `R` but then it becomes a case of porting things left and right depending on how we decide to do the post analyses\n3. Curate networks for the different datasets/scenarios we select - I feel like there might be some scenarios that we can't do all models for all datasets but maybe I'm being a pessimist.\n    * Need to also think about where one might find the additional data for some of the models...\n        * Body size: @herbersteinAnimalTraitsCuratedAnimal2022 - Although maybe Andrew has strong thotsTM RE the one true body size database to rule them all...\n        * Other trait sources: @wilmanEltonTraitsSpecieslevelForaging2014 and @jonesPanTHERIASpecieslevelDatabase2009\n        * This is where we'll get the paleo traits from if I'm correct @bambachAutecologyFillingEcospace2007\n        * Phylogeny stuff: @uphamInferringMammalTree2019 (what we used for TL but its only mammals...) but I'm sure there will be others\n    * Also limitation of scope... *e.g.,* do we even dare to think about including plants/basal producers (see *e.g.,* @valdovinosBioenergeticFrameworkAboveground2023)\n    * Taxonomic harmonisation - something to think about and check\n4. compare model performance based on the ideas currently listed in the results section.\n5. Make a pretty picture that summarises things - maybe overlapping Venn circles that showcase which models do well in the different spheres/aspects of life\n\n## Results\n\nHow we want to compare and contrast. I think there won't be a 'winner' and thus we need to think of 'tests' that are going to measure performance in different situations/settings. 
With that in mind I think some valuable points to consider would be:\n\n* Structural vs pairwise link predictions (graph vs node level)\n  * % of links correctly retrieved\n  * connectance\n  * trophic level\n  * generalism vs specialism\n  * something related to false positives/negatives\n  * intervality\n* Data 'cost' (some methods might need a lot lot of supporting data vs something very light weight)\n* I think it would be remiss to not also take into consideration computational cost\n* something about the network output - I'm acknowledging my biases and saying that probabilistic (or *maybe* weighted) links are the way\n\n@cohenStochasticTheoryCommunity1985 actually tells us that the cascade model only really works for communities that range from 3-33 species... and @williamsSuccessItsLimits2008 also highlights how structural models really only work for small communities\n\n> maybe we can put these into broader categories - if we do start doing the venn overlap thing. *E.g.,* local scale predictions, regional scale predictions, pairwise interactions, structural (energetics), computationally cheap, low cost data\n\n### Qualitative stuff\n\n\n\n\n\n{{< embed notebooks/model_qualitative.qmd#fig-venn >}}\n\n\n\n\n\n\n\n\n\n## Discussion\n\nI think a big take home will (hopefully) be how different approaches do better in different situations and so you as an end user need to take this into consideration and pick accordingly. I think @petcheyFitEfficiencyBiology2011 might have (and share) some thoughts on this (thanks Andrew). 
I feel like I need to look at @berlowGoldilocksFactorFood2008 but maybe not exactly in this context but vaguely adjacent.\n\nAn interesting thing to also think about (and arguably it will be addressed based on some of the other thoughts and ideas) is data dependant and data independent 'parametrisation' of the models...\n\nI probably think about this point too much but a point of discussion that I think will be interesting to bring up the idea that if a model is missing a specific pairwise link but doing well at the structural level then when does it matter? I think this is covered with the whole node vs graph level performance but I kind of just want to bring it up here again because also one of those things that I think about a bit too much probably...\n\n> Thinking very long term here and maybe a bit beyond the scope but also thinking about a multi- model approach? So in other words using one model to build an initial network but maybe a second one to constrain it a bit better. I blame this thought on the over-connected PFIM food webs...\n\n## References {.unnumbered}\n\n::: {#refs}\n:::\n:::\n\n",
+    "markdown": "---\ntitle: T is for Topology\nauthor:\n  - name: Tanya Strydom\n    id: ts\n    orcid: 0000-0001-6067-1349\n    corresponding: true\n    email: t.strydom@sheffield.ac.uk\n    roles:\n      - Words\n      - Nonsense\n      - Rascality\n      - Visualisation (although absent)\n    affiliation:\n      - id: sheffield\n        name: University of Sheffield\n  - name: Andrew P. Beckerman\n    id: apb\n    orcid: 0000-0002-7859-8394\n    corresponding: false\n    roles: []\n    affiliations:\n      - ref: sheffield\nfunding: 'The author(s) received no specific funding for this work. Well they did I just haven''t done the '\nkeywords:\n  - food web\n  - network construction\nabstract: |\n  Pending...\nplain-language-summary: |\n  We want to know a bit more about the different network topology generators (predict tools) and how they differ - *i.e.,*  their strengths and weaknesses\ndate: last-modified\nbibliography: references.bib\ncitation:\n  container-title: Some fancy journal\nnumber-sections: true\n---\n\n:::{#d09af4d8 .cell .markdown}\n## Introduction\n\nThe standard run of the mill that we cannot always feasibly construct networks because 1. hard, 2. time (yay dinosaurs, but also the future and impending doom I guess), and 3. probably something else meaningful that's just slipping my mind at the moment. Some of the usual culprits will come in here like: @jordanoSamplingNetworksEcological2016; [@jordanoChasingEcologicalInteractions2016]; @poisotGlobalKnowledgeGaps2021\n\n> TODO: standardise language between a topology generator and a topology predictor... I guess they can both be considered models but I'm not quite sure...\n\n### Philosophical contemplation when constructing interaction networks\n\n#### Why do we want to construct an interaction network?\n\nArguably the need for methods and tools for constructing interaction networks arises from two different (but still aligned) places of interest within the field of network ecology. 
On the one side of the spectrum sits the researcher who is interested in generating a set of ecologically plausible but not necessarily realised 'in the field' for the purpose of running further simulations (*e.g.,* extinction sim REF? **TODO**) or understanding some higher-level process (*e.g.,* energetics REF **TODO**). This researcher is contrasted by one that is interested in constructing real-world, location specific interactions networks in lieu of having access to data generated in the field (see @strydomRoadmapPredictingSpecies2021 for a discussion on this). Of course these two categories are not distinct, mutually exclusive, groups but can rather be viewed as operating on a gradient ranging from a need for generality (*something*) to a need for specificity (local-level predictions) when it comes to the quality(?) of the network that is constructed by a specific tool. Of course this research need would also reflected in the model development process itself and thus the idea of what a 'good enough' constructed network will be in the context of assessing the performance of a specific model. \n\n@cohenStochasticTheoryCommunity1985 states that *\"[Their] approach is more like gross anatomy than like physiology... that is, the gross anatomy is frozen, rather than in motion.\"*. \n\nInterestingly @williamsSuccessItsLimits2008 also explicitly talk about *structural* food-web models in their introduction... so how I see it that means that there has always been this inherent acknowledgement that models are function at a specific 'network level'.\n\n#### The history behind the approach\n\n\nMaybe a brief history of the development of predictive tools/topo generators? Sort of where the theory/body of work was based and how that has changed? 
IS there a difference between topo generator and predictive tool - I'm inclined to think that it aligns with the whole debate of high level structure vs node-level perfection\n\nMaybe start here with discussing the core mechanistic differences that models will work at --- some are really concerned about (and thus constrained by) structure, others are more mechanistic in nature *i.e.,* species *a* has the capacity to eat species *b* because traits (read gob size), and then you get @rohrModelingFoodWebs2010 and @strydomFoodWebReconstruction2022 that sit in the weird liminal latent space... \n\nHere I will probably get on my (newly discovered) soapbox and wax lyrical about how in certain situations structure is enough (and that will probably be for some high-level things like thinking about energy flows etc., I can also see a world in which maybe you want to do some sort of robustness/extinction work - since then you're usually doing 'random' (within limits) extinctions) but there may be use cases where we are really interested in the node-level interactions *i.e.,* species identity is like a thing we need to care about and also be able to retrieve specific interactions at specific nodes correctly. What is the purpose of generating a network? Is it an element of a bigger question we are asking, *e.g.,* I want to generate a series of networks to do some extinction simulations/bioenergetic stuff OR are we looking for a 'final product' network that is relevant to a specific location? (this can still be broad in geographic scope). \n\nAt some point we are going to need to discuss the key differences and implications between predicting a metaweb (*sensu* @dunneNetworkStructureFood2006) and a network realisation. 
And here I can't help but think about @poisotSpeciesWhyEcological2015 (and probably other papers) that discuss how the local factors are going to play a role and even the same pair of species may interact differently in different points in the landscape.\n\n> Do we need to delve into individual-based networks? (*sensu* Tinker 2012, Araújo 2008) I think its probably a step too far and one starts creeping into apples and pears type of comparisons. Especially since these work off of already existing networks (I seem to recall) and its more about about 'tweaking' those - so not so much *de novo* predictions. Although this might be useful to keep in mind when it comes to re-wiring... Also on that note do we opn the re-wiring door here in this ms or wait it out a bit.\n\n## Data & Methods {#sec-data-methods}\n\n### Overview of models\n\n#### Structural models\n\n**Random model** [@erdosRandomGraphs1959]: Links are assembled randomly, not developed within an ecological framework. But of course could still hold if we assume that communities are randomly assembled in terms of who is interacting with who (I seem to think that's sort of what May was arguing but I would need to remind myself)\n\n**Cascade model** [@cohenCommunityFoodWebs1990]: Much like the name suggests the cascade model rests on the idea that species feed on one another in a hierarchical manner. This rests on the assumption that the links within a network are variably distributed across the network; with the proportion of links decreasing as one moves up the trophic levels (*i.e.,* 'many' prey and 'few' predators). This is achieved by assigning all species a random rank, this rank will then determine both the predators and prey of that species. A species will have a particular probability of being fed on by any species with a higher ranking than it, this probability is constrained by the specified connectance of the network. 
Interestingly here 'species' are treated as any individual that consume and are consumed by the same 'species', *i.e.,* these are not taxonomical species [@cohenStochasticTheoryCommunity1985]. The original cascade model has altered to be more 'generalised' [@stoufferQuantitativePatternsStructure2005], which altered the probability distribution of the prey that could be consumed by a species. \n\n**Niche model** [@williamsSimpleRulesYield2000]: The niche model introduces the idea that species interactions are based on the 'feeding niche' of a species. Broadly, all species are randomly assigned a 'feeding niche' range and all species that fall in this range can be consumed by that species (thereby allowing for cannibalism). The niche of each species is randomly assigned and the range of each species' niche is (in part) constrained by the specified connectance of the network. The niche model has also been modified, although it appears that adding to the 'complexity' of the niche model does not improve on its ability to generate a more ecologically 'correct' network [@williamsSuccessItsLimits2008]. \n\n**Nested hierarchy model** [@cattinPhylogeneticConstraintsAdaptation2004]:\n\n#### Mechanistic models\n\n**Allometric diet breadth model (ADBM)** [@petcheySizeForagingFood2008]:\n\n**Log-ratio** [@rohrModelingFoodWebs2010]: Interestingly often used in paleo settings (at least that's what it currently looks like in my mind... (*e.g.,* @yeakelCollapseEcologicalNetwork2014, @piresMegafaunalExtinctionsHuman2020)\n\n**Stochastic** [@rossbergFoodWebsExperts2006]:\n\n**PFIM** [@shawFrameworkReconstructingAncient2024]:\n\n**Trait-based** [@caronAddressingEltonianShortfall2022]:\n\n**Graph embedding** [@strydomFoodWebReconstruction2022; @strydomGraphEmbeddingTransfer2023]: At a high level graph embedding focuses on capturing the structural data of a network as opposed to a list of pairwise (*i.e.,* mechanistic) interactions. 
Here specifically the embedding is performed on a known interaction network and captures information as to where species (nodes) are positioned in a network *e.g.,* are they basal prey species or top predators, similar to the log ratio model. In @strydomFoodWebReconstruction2022 the products of the embedding process are fed into a transfer learning framework for novel prediction...\n\nI know tables are awful but in this case they may make more sense. Also I don't think I'm at the point where I can say that the table is complete/comprehensive but it is getting there. Not sure about putting in some papers that have used the model - totes happy to drop those I think...\n\n| Model               | Core Mechanism | End product | Specificity      | Interaction  | Data |\n|---------------------|----------------|-------------|------------------|--------------|------|\n| random              | structural     | network     | species agnostic | binary       |\n| cascade             | structural     | network     | species agnostic | binary       |\n| niche               | structural     | network     | species agnostic | binary       |\n| nested hierarchical | structural     | network     | species agnostic | binary       |\n| ADBM                | mechanistic    |             | energetics       | quantitative |\n| log-ratio           | pondering...   |             |                  |              |\n| PFIM                | mechanistic    | metaweb     | trait based      |pondering...  
|\n| graph embedding     | embedding      | metaweb     | evolutionary     |probabilistic |\n| trait model         | mechanistic    | metaweb     | trait based      |              |\n| stochastic          |                |             |                  |              |\n\n: Lets make a table that gives an overview of the different topology generators that we will look at {#tbl-history}\n\n> Might be nice to have a little appendix/supp mat that breaks down the models in detail so that they are all in one place so that someone (grad student being told they need to build networks) some day can go and educate themselves with slightly lower effort. This will also be useful for me should I end up having to do some actual coding - think of this as step one in the pseudo code process.\n\n### Datasets used\n\nHere I think we need to span a variety of domains, at minimum aquatic and terrestrial but maybe there should be a 'scale' element as well *i.e.,* a regional and local network. I think there is going to be a 'turning point' where structural will take over from mechanistic in terms of performance. More specifically at local scales bioenergetic constraints (and co-occurrence) may play a bigger role in structuring a network whereas at the metaweb level then mechanistic may make more (since by default its about who can potentially interact and obviously not constrained by real-world scenarios) *sensu* @caronTrophicInteractionModels2023. Although having said that I feel that contradicts the idea of backbones (*sensu* Bramon Mora (sp?) et al & Stouffer et al) But that might be where we get the idea of core *structure* vs something like linkage density. So core things like trophic level/chain length will be conserved but connectance might not (I think I understand what I'm trying to say here)\n\nI think we should also use the Dunne (I think) Cambrian (also think) network (I was correct and its this one @dunneCompilationNetworkAnalyses2008). 
Because 1) it gives the paleo-centric methods their moment in the sun and 2) I think it also brings up the interesting question of can we use modern structure to predict past ones? Here one might expect a more mechanistic approach to shine.\n\nDraw the other datasets from `Mangal` because they will be nicely formatted and essentially at point and shoot level\n\n### Comparing different models\n\nFor now the (still essentially pending) workflow/associated code can be found at the following repository [BecksLab/topology_generators](https://github.com/BecksLab/topology_generators)\n\n1. Shortlist/finalise the different topo generators\n2. collate/translate into `Julia`\n    * *e.g.,* some models wil be in SpeciesInteractionNetworks.jl (new EcoNet); I know (parts of) the transfer learning stuff is and the niche model\n    * others will need to be coded out (the more simpler models should be easier)\n    * can also consider `R` but then it becomes a case of porting things left and right depending on how we decide to do the post analyses\n3. Curate networks for the different datasets/scenarios we select - I feel like there might be some scenarios that we can't do all models for all datasets but maybe I'm being a pessimist.\n    * Need to also think about where one might find the additional data for some of the models...\n        * Body size: @herbersteinAnimalTraitsCuratedAnimal2022 - Although maybe Andrew has strong thotsTM RE the one true body size database to rule them all...\n        * Other trait sources: @wilmanEltonTraitsSpecieslevelForaging2014 and @jonesPanTHERIASpecieslevelDatabase2009\n        * This is where we'll get the paleo traits from if I'm correct @bambachAutecologyFillingEcospace2007\n        * Phylogeny stuff: @uphamInferringMammalTree2019 (what we used for TL but its only mammals...) but I'm sure there will be others\n    * Also limitation of scope... 
*e.g.,* do we even dare to think about including plants/basal producers (see *e.g.,* @valdovinosBioenergeticFrameworkAboveground2023)\n    * Taxonomic harmonisation - something to think about and check\n4. compare model performance based on the ideas currently listed in the results section.\n5. Make a pretty picture that summarises things - maybe overlapping Venn circles that showcase which models do well in the different spheres/aspects of life\n\n## Results\n\nHow we want to compare and contrast. I think there won't be a 'winner' and thus we need to think of 'tests' that are going to measure performance in different situations/settings. With that in mind I think some valuable points to consider would be:\n\n* Structural vs pairwise link predictions (graph vs node level)\n  * % of links correctly retrieved\n  * connectance\n  * trophic level\n  * generalism vs specialism\n  * something related to false positives/negatives\n  * intervality\n* Data 'cost' (some methods might need a lot lot of supporting data vs something very light weight)\n* I think it would be remiss to not also take into consideration computational cost\n* something about the network output - I'm acknowledging my biases and saying that probabilistic (or *maybe* weighted) links are the way\n\n@cohenStochasticTheoryCommunity1985 actually tells us that the cascade model only really works for communities that range from 3-33 species... and @williamsSuccessItsLimits2008 also highlights how structural models really only work for small communities\n\n> maybe we can put these into broader categories - if we do start doing the venn overlap thing. 
*E.g.,* local scale predictions, regional scale predictions, pairwise interactions, structural (energetics), computationally cheap, low cost data\n\n### Qualitative stuff\n\n\n\n\n\n{{< embed notebooks/model_qualitative.qmd#fig-venn >}}\n\n\n\n\n\n\n\n\n\n## Discussion\n\nI think a big take home will (hopefully) be how different approaches do better in different situations and so you as an end user need to take this into consideration and pick accordingly. I think @petcheyFitEfficiencyBiology2011 might have (and share) some thoughts on this (thanks Andrew). I feel like I need to look at @berlowGoldilocksFactorFood2008 but maybe not exactly in this context but vaguely adjacent.\n\nAn interesting thing to also think about (and arguably it will be addressed based on some of the other thoughts and ideas) is data dependant and data independent 'parametrisation' of the models...\n\nI probably think about this point too much but a point of discussion that I think will be interesting to bring up the idea that if a model is missing a specific pairwise link but doing well at the structural level then when does it matter? I think this is covered with the whole node vs graph level performance but I kind of just want to bring it up here again because also one of those things that I think about a bit too much probably...\n\n> Thinking very long term here and maybe a bit beyond the scope but also thinking about a multi- model approach? So in other words using one model to build an initial network but maybe a second one to constrain it a bit better. I blame this thought on the over-connected PFIM food webs...\n\n## References {.unnumbered}\n\n::: {#refs}\n:::\n:::\n\n",
     "supporting": [
       "index_files/figure-html"
     ],
diff --git a/index.qmd b/index.qmd
index 21671f5..f9e2c3c 100644
--- a/index.qmd
+++ b/index.qmd
@@ -92,22 +92,22 @@ At some point we are going to need to discuss the key differences and implicatio
 
 **Trait-based** [@caronAddressingEltonianShortfall2022]:
 
-**Graph embedding** [@strydomGraphEmbeddingTransfer2023]: *e.g.,* [@strydomFoodWebReconstruction2022]
+**Graph embedding** [@strydomFoodWebReconstruction2022; @strydomGraphEmbeddingTransfer2023]: At a high level, graph embedding focuses on capturing the structural data of a network as opposed to a list of pairwise (*i.e.,* mechanistic) interactions. Here specifically the embedding is performed on a known interaction network and captures information about where species (nodes) are positioned in the network, *e.g.,* whether they are basal prey species or top predators, similar to the log-ratio model. In @strydomFoodWebReconstruction2022 the products of the embedding process are fed into a transfer learning framework for novel prediction...
 
 I know tables are awful but in this case they may make more sense. Also I don't think I'm at the point where I can say that the table is complete/comprehensive but it getting there Not sure about putting in some papers that have used the model - totes happy to drop those I think...
 
-| Model               | Core Mechanism | End product | Specificity      | Interaction |
-|---------------------|----------------|-------------|------------------|-------------|
-| random              | structural     | network     | species agnostic | binary      |
-| cascade             | structural     | network     | species agnostic | binary      |
-| niche               | structural     | network     | species agnostic | binary      |
-| nested hierarchical | structural     | network     | species agnostic | binary      |
-| ADBM                | mechanistic    |             | energetics       | quantitative|
-| log-ratio           | pondering...   |             |                  |             |
-| PFIM                | mechanistic    | metaweb     | trait based      |pondering... |
-| graph embedding     | embedding      | metaweb     | evolutionary     |probabilistic|
-| trait model         | mechanistic    | metaweb     | trait based      |             |
-| stochastic          |                |             |                  |             |
+| Model               | Core Mechanism | End product | Specificity      | Interaction  | Data |
+|---------------------|----------------|-------------|------------------|--------------|------|
+| random              | structural     | network     | species agnostic | binary       |      |
+| cascade             | structural     | network     | species agnostic | binary       |      |
+| niche               | structural     | network     | species agnostic | binary       |      |
+| nested hierarchical | structural     | network     | species agnostic | binary       |      |
+| ADBM                | mechanistic    |             | energetics       | quantitative |      |
+| log-ratio           | pondering...   |             |                  |              |      |
+| PFIM                | mechanistic    | metaweb     | trait based      |pondering...  |      |
+| graph embedding     | embedding      | metaweb     | evolutionary     |probabilistic |      |
+| trait model         | mechanistic    | metaweb     | trait based      |              |      |
+| stochastic          |                |             |                  |              |      |
 
 : Lets make a table that gives an overview of the different topology generators that we will look at {#tbl-history}