Updates to website
rcalef committed Dec 10, 2024
1 parent 56cdb0a commit 1541444
Showing 2 changed files with 25 additions and 23 deletions.
48 changes: 25 additions & 23 deletions docs/index.html
@@ -66,7 +66,7 @@ <h1 class="title is-1 publication-title">A multimodal foundation model for prote
<a href="https://www.linkedin.com/in/robert-calef/" target="_blank">Robert Calef</a><sup>1,2,*</sup>,
</span>
<span class="author-block">
<a href="#" target="_blank">Valentina Giunchiglia</a><sup>1,3,4</sup>,
<a href="https://valegiunchiglia.github.io/personal_website/" target="_blank">Valentina Giunchiglia</a><sup>1,3,4</sup>,
</span>
<span class="author-block">
<a href="#" target="_blank">Tianlong Chen</a><sup>1,2</sup>,
@@ -84,31 +84,31 @@ <h1 class="title is-1 publication-title">A multimodal foundation model for prote
<a href="#" target="_blank">Ayush Noori</a><sup>1</sup>,
</span>
<span class="author-block">
<a href="#" target="_blank">Joseph Brown</a><sup>5</sup>,
<a href="#" target="_blank">Joseph Brown</a><sup>5,&dagger;</sup>,
</span>
<span class="author-block">
<a href="#" target="_blank">Tom Cobley</a><sup>2</sup>,
<a href="#" target="_blank">Tom Cobley</a><sup>2,6</sup>,
</span>
<span class="author-block">
<a href="#" target="_blank">Karin Hrovatin</a><sup>6,7</sup>,
<a href="#" target="_blank">Karin Hrovatin</a><sup>7,8</sup>,
</span>
<span class="author-block">
<a href="https://www.tomhartvigsen.com/" target="_blank">Tom Hartvigsen</a><sup>8</sup>,
<a href="https://www.tomhartvigsen.com/" target="_blank">Tom Hartvigsen</a><sup>9</sup>,
</span>
<span class="author-block">
<a href="https://www.helmholtz-munich.de/en/icb/research-groups/theis-lab" target="_blank">Fabian J. Theis</a><sup>6,9</sup>,
<a href="https://www.helmholtz-munich.de/en/icb/research-groups/theis-lab" target="_blank">Fabian J. Theis</a><sup>7,10</sup>,
</span>
<span class="author-block">
<a href="https://pentelutelabmit.com/" target="_blank">Bradley Pentelute</a><sup>5,13</sup>,
<a href="https://pentelutelabmit.com/" target="_blank">Bradley Pentelute</a><sup>5,14</sup>,
</span>
<span class="author-block">
<a href="https://khuranalab.bwh.harvard.edu/" target="_blank">Vikram Khurana</a><sup>10,11,13</sup>,
<a href="https://khuranalab.bwh.harvard.edu/" target="_blank">Vikram Khurana</a><sup>11,12,14</sup>,
</span>
<span class="author-block">
<a href="https://compbio.mit.edu/" target="_blank">Manolis Kellis</a><sup>2,13</sup>,
<a href="https://compbio.mit.edu/" target="_blank">Manolis Kellis</a><sup>2,14</sup>,
</span>
<span class="author-block">
<a href="https://zitniklab.hms.harvard.edu/" target="_blank">Marinka Zitnik</a><sup>1,12,13,14,&Dagger;</sup>
<a href="https://zitniklab.hms.harvard.edu/" target="_blank">Marinka Zitnik</a><sup>1,13,14,15,&Dagger;</sup>
</span>
</div>

@@ -118,20 +118,22 @@ <h1 class="title is-1 publication-title">A multimodal foundation model for prote
<span class="author-block"><sup>1</sup>Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA</span>
<span class="author-block"><sup>2</sup>Computer Science and Artificial Intelligence Laboratory, MIT, Cambridge, MA, USA</span>
<span class="author-block"><sup>3</sup>Department of Brain Sciences, Imperial College London, London, UK</span>
<span class="author-block"><sup>4</sup>Centre for Neuroimaging Science, King's College London, London, UK</span>
<span class="author-block"><sup>4</sup>Centre for Neuroimaging Sciences, King's College London, London, UK</span>
<span class="author-block"><sup>5</sup>Department of Chemistry, MIT, Cambridge, MA, USA</span>
<span class="author-block"><sup>6</sup>Institute of Computational Biology, Helmholtz Center Munich, Munich, Germany</span>
<span class="author-block"><sup>7</sup>TUM School of Life Sciences Weihenstephan, Technical University of Munich, Freising, Germany</span>
<span class="author-block"><sup>8</sup>School of Data Science, University of Virginia, VA, USA</span>
<span class="author-block"><sup>9</sup>School of Computation, Information and Technology, Technical University of Munich, Garching, Germany</span>
<span class="author-block"><sup>10</sup>Department of Neurology, Brigham and Women's Hospital, Boston, MA, USA</span>
<span class="author-block"><sup>11</sup>Harvard Stem Cell Institute, Cambridge, MA, USA</span>
<span class="author-block"><sup>12</sup>Kempner Institute for the Study of Natural and Artificial Intelligence, Harvard University, MA, USA</span>
<span class="author-block"><sup>13</sup>Broad Institute of MIT and Harvard, Cambridge, MA, USA</span>
<span class="author-block"><sup>14</sup>Harvard Data Science Initiative, Cambridge, MA, USA</span>
<span class="author-block"><sup>6</sup>Department of Computing, Imperial College London, London, UK</span>
<span class="author-block"><sup>7</sup>Institute of Computational Biology, Computational Health Center, Helmholtz Munich, Munich, Germany</span>
<span class="author-block"><sup>8</sup>TUM School of Life Sciences Weihenstephan, Technical University of Munich, Freising, Germany</span>
<span class="author-block"><sup>9</sup>School of Data Science, University of Virginia, VA, USA</span>
<span class="author-block"><sup>10</sup>School of Computation, Information and Technology, Technical University of Munich, Garching, Germany</span>
<span class="author-block"><sup>11</sup>Department of Neurology, Brigham and Women’s Hospital, Boston, MA, USA</span>
<span class="author-block"><sup>12</sup>Harvard Stem Cell Institute, Cambridge, MA, USA</span>
<span class="author-block"><sup>13</sup>Kempner Institute for the Study of Natural and Artificial Intelligence, Harvard University, MA, USA</span>
<span class="author-block"><sup>14</sup>Broad Institute of MIT and Harvard, Cambridge, MA, USA</span>
<span class="author-block"><sup>15</sup>Harvard Data Science Initiative, Cambridge, MA, USA</span>

<span class="eql-cntrb"><small><br><sup>*</sup>Co-first authors</small></span>
<span class="eql-cntrb"><small><br><sup>+</sup>Present address: Department of Computer Science, Stanford University, Stanford, CA, USA</small></span>
<span class="eql-cntrb"><small><br><sup>&dagger;</sup>Present address: Acceleration Consortium, University of Toronto, Toronto, ON, Canada</small></span>
<span class="eql-cntrb"><small><br><sup>&Dagger;</sup>Corresponding author. Email: marinka@hms.harvard.edu</small></span>
</div>
</details>
@@ -229,7 +231,7 @@ <h2 class="subtitle has-text-left is-size-5 has-text-weight-normal">
<h2 class="title is-3">Abstract</h2>
<div class="content has-text-justified">
<p>
Understanding the roles of human proteins remains a major challenge, with approximately 20\% of human proteins lacking known functions and more than 40\% missing context-specific functional insights. Even well-annotated proteins are often poorly characterized in diverse biological contexts, disease states, and perturbations.
Understanding the roles of human proteins remains a major challenge, with approximately 20% of human proteins lacking known functions and more than 40% missing context-specific functional insights. Even well-annotated proteins are often poorly characterized in diverse biological contexts, disease states, and perturbations.
We present ProCyon, a foundation model for modeling, generating, and predicting protein phenotypes across five interrelated knowledge domains: molecular functions, therapeutic mechanisms, disease associations, functional protein domains, and molecular interactions. To support this, we created ProCyon-Instruct, a dataset of 33 million protein phenotype instructions, representing a comprehensive resource for multiscale protein phenotypes.
By co-training a large language model with multimodal molecular encoders, ProCyon integrates phenotypic and protein data. A novel architecture and instruction tuning strategy allow ProCyon to process arbitrarily interleaved protein-and-phenotype inputs, achieve zero-shot task transfer, and generate free-form text phenotypes interleaved with retrieved protein sequence, structure, and drug modalities in a single unified model.
ProCyon achieves strong performance against single-modality models, multimodal models such as ESM3, as well as text-only LLMs on dozens of benchmarking tasks such as contextual protein retrieval and question answering.
@@ -311,7 +313,7 @@ <h2 class="title is-2 has-text-centered">Performance Comparison</h2>
<h2 class="subtitle" style="max-width: 75%; margin: 0 auto; text-align: left;">
ProCyon shows strong performance on a benchmark of fourteen biologically-relevant tasks constructed from ProCyon-Instruct and framed
as either question-answering or protein retrieval tasks.
ProCyon is the only model to consistently out perform both single-modality and multi-modality models across tasks. We also
ProCyon is the only model to consistently outperform both single-modality and multi-modality models across tasks. We also
find that <strong>ProCyon maintains strong performance on 3,250 completely unseen phenotypes across knowledge domains</strong>, showing
its ability to reason over novel scientific concepts.
</h2>
@@ -334,7 +336,7 @@ <h2 class="title is-3 has-text-centered">Free-text protein retrieval</h2>
<img src="static/images/sting_fig_v2.png" alt="ProCyon STING figure" style="height: 400px; width: auto;">
<section class="section hero is-light">
<h2 class="subtitle" style="max-width: 75%; margin: 0 auto; text-align: left;">
ProCyon is able to successfully retrieve the STING protein given functional queries related to neuronal inflammatory stres response,
ProCyon is able to successfully retrieve the STING protein given functional queries related to neuronal inflammatory stress response,
a role of STING that was only described in <a href="https://pubmed.ncbi.nlm.nih.gov/38878778/">scientific literature </a>published after
ProCyon's training data cutoff date. Increasingly precise and functionally-relevant descriptions increase the retrieval rank of STING,
showing <strong>ProCyon's ability to assist in the scientific discovery process.</strong>
Binary file modified docs/static/images/model_use_cases.png
