The CBTN released the PBTA raw genomic data in September 2018 without embargo, allowing researchers immediate access to begin making discoveries on behalf of children with CNS tumors everywhere. Since this release, the CBTN has approved over 200 data research projects [@doi:10.1016/j.neo.2022.100846] from 69 different institutions, with 60% from non-CBTN sites. We created OpenPBTA as an open, real-time, reproducible analysis framework to genomically characterize pediatric brain tumors, bringing together basic and translational researchers, clinicians, and data scientists. We provide reusable code and data resources, paired with cloud-based availability of source and derived data resources, to the pediatric oncology community, encouraging interdisciplinary collaboration. To our knowledge, this initiative represents the first large-scale, collaborative, open analysis of genomic data coupled with open manuscript writing, wherein we comprehensively analyzed the PBTA cohort. Using available WGS, WXS, and RNA-Seq data, we generated high-confidence consensus SNV and CNV calls, prioritized putative oncogenic fusions, and established over 40 scalable and rigorously reviewed modules to perform common downstream cancer genomics analyses. We detected expected patterns of genomic lesions, mutational signatures, and aberrantly regulated signaling pathways across multiple pediatric brain tumor histologies.
Assembling large, pan-histology cohorts of fresh frozen samples and associated clinical phenotypes and outcomes requires a multi-year, multi-institutional framework, like those provided by CBTN and PNOC.
As such, uniform clinical molecular subtyping was largely not performed for this cohort at the time of sample collection.
Since DNA methylation data for these samples were not yet available to classify molecular subtypes, we created RNA- and DNA-based subtyping modules aligned with WHO molecularly-defined diagnoses.
We worked closely with pathologists and clinicians to assign research-grade integrated diagnoses for 60% of tumors while discovering incorrectly diagnosed or mis-identified samples in the OpenPBTA cohort.
For example, we subtyped medulloblastoma tumors, of which only 35% (43/122) had prior subtype information from pathology reports, using MM2S or MedulloClassifier [@doi:10.1186/s13029-016-0053-y; @doi:10.1371/journal.pcbi.1008263] and subsequently applied the consensus of these methods to subtype all medulloblastomas.
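
As an illustration of how the calls from two classifiers might be combined into a consensus, the following is a minimal sketch. The agreement rule, the `discordant` and `unclassified` labels, the subtype label spellings, and the sample names are all assumptions made for illustration; they are not taken from the OpenPBTA subtyping module.

```python
from typing import Optional

# Assumed label set for the four consensus medulloblastoma subgroups.
SUBTYPES = {"WNT", "SHH", "Group3", "Group4"}


def consensus_subtype(call_a: Optional[str], call_b: Optional[str]) -> str:
    """Combine two classifier calls under an assumed agreement rule.

    - Both calls valid and equal: return the shared subtype.
    - Both calls valid but different: return "discordant".
    - Only one valid call: return that call.
    - No valid call: return "unclassified".
    """
    valid_a = call_a if call_a in SUBTYPES else None
    valid_b = call_b if call_b in SUBTYPES else None
    if valid_a and valid_b:
        return valid_a if valid_a == valid_b else "discordant"
    return valid_a or valid_b or "unclassified"


# Hypothetical per-sample calls: (first classifier, second classifier).
calls = {
    "sample_1": ("SHH", "SHH"),
    "sample_2": ("Group3", "Group4"),
    "sample_3": (None, "WNT"),
    "sample_4": (None, None),
}

consensus = {
    sample: consensus_subtype(a, b) for sample, (a, b) in calls.items()
}
```

Under this assumed rule, samples flagged `discordant` would be the natural candidates for manual review against pathology reports.
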
We advanced the integrative analyses and cross-cohort comparison via a number of validated modules. We used an expression classifier to determine whether tumors have dysfunctional TP53 [@doi:10.1016/j.celrep.2018.03.076] and the EXTEND algorithm to determine their degree of telomerase activity using a 13-gene signature [@doi:10.1038/s41467-020-20474-9]. Interestingly, we found that hypermutant HGGs universally displayed TP53 dysregulation, unlike adult cancers such as colorectal cancer and gastric adenocarcinoma, where TP53 dysregulation in hypermutated tumors is less common [@doi:10.18632/oncotarget.22783; @doi:10.1038/nature13480]. Furthermore, high TP53 scores were a significant prognostic marker for poor overall survival for patients with tumor types including H3 K28-mutant DMGs and ependymomas. We also show that EXTEND scores are a robust surrogate measure for telomerase activity in pediatric brain tumors. By assessing TP53 and telomerase activity prospectively from expression data, information usually only attainable with DNA sequencing and/or qPCR, we incorporated oncogenic biomarker and prognostic knowledge, thereby expanding our biological understanding of these tumors.
We identified enrichment of hallmark cancer pathways and characterized the immune cell landscape across pediatric brain tumors, demonstrating that tumors in some histologies, such as schwannomas, craniopharyngiomas, and low-grade gliomas, may have an inflammatory tumor microenvironment. Notably, we observed upregulation of IFN$\gamma$, IL-1, IL-6, and TNF$\alpha$ in craniopharyngiomas, tumors that are difficult to resect due to their anatomical location and critical surrounding structures. Neurotoxic side effects have been reported in response to IFN$\alpha$ immunotherapy [@doi:10.3171/2015.2.PEDS14656; @doi:10.5348/ijcri-2013-12-419-CR-13], leading researchers to propose additional immune vulnerabilities, such as IL-6 inhibition and immune checkpoint blockade, as therapies for cystic adamantinomatous craniopharyngiomas [@doi:10.1093/neuonc/noy035; @pubmed:34966342; @pubmed:32075140; @doi:10.1007/s00401-018-1830-2; @doi:10.3389/fonc.2019.00791]. Our results support this endeavor. Finally, we reproduced the overall known poor infiltration of CD8+ T cells and general low expression of CD274 (PD-L1) in pediatric brain tumors, highlighting that we urgently need novel therapeutic strategies for tumors unlikely to respond to immune checkpoint blockade therapy.
While large-scale collaborative efforts may take a longer time to complete, adoption of an open science framework substantially mitigated this concern. By maintaining all data, analytical code, and results in public repositories, we ensured that such logistics did not hinder progress in pediatric cancer research. Indeed, OpenPBTA is already a foundational data analysis and processing layer for several discovery research and translational projects, which will continue to add other genomic modalities and analyses, including germline, epigenomic, single-cell, splicing, imaging, and model drug response data. For example, the OpenPBTA RNA fusion filtering module led to the development of the R package annoFuse [@doi:10.1186/s12859-020-03922-7] and an R Shiny application, shinyFuse. Leveraging OpenPBTA's medulloblastoma subtyping and immune deconvolution analyses, Dang and colleagues showed that SHH tumors are enriched with monocyte and microglia-derived macrophages, which may accumulate following radiation therapy [@doi:10.1016/j.celrep.2021.108917]. Expression and CNV analyses demonstrated that GPC2 is a highly expressed and copy-number gained immunotherapeutic target in ETMRs, medulloblastomas, choroid plexus carcinomas, H3 wildtype high-grade gliomas, and DMGs. Foster and colleagues therefore developed a chimeric antigen receptor (CAR) directed against GPC2, which shows preclinical efficacy in mouse models [@doi:10.1136/jitc-2021-004450]. Another study harnessed OpenPBTA to integrate germline variants, discovering that pediatric HGG patients with alternative telomere lengthening are enriched for pathogenic or likely pathogenic germline variants in the MMR pathway, possess oncogenic ATRX mutations, and have increased TMB [@doi:10.1093/neuonc/noac278].
Moreover, OpenPBTA has enabled a framework to support real-time integration of clinical trial subjects as they enrolled on the PNOC008 high-grade glioma clinical trial [@clinicaltrials:NCT03739372] or PNOC027 medulloblastoma clinical trial [@clinicaltrials:NCT05057702], allowing researchers and clinicians to link tumor biology to translational impact through clinical decision support during tumor board discussions. Finally, as part of the NCI's CCDI, OpenPBTA was recently expanded into OpenPedCan, a pan-pediatric cancer effort (https://github.com/PediatricOpenTargets/OpenPedCan-analysis), which enabled creation of the pediatric Molecular Targets Platform (https://moleculartargets.ccdi.cancer.gov/) in support of the RACE Act. An additional, large-scale cohort of >1,500 tumor samples and associated germline DNA is undergoing harmonization as part of the CBTN CCDI-Kids First NCI and Common Fund project (https://commonfund.nih.gov/kidsfirst/2021X01projects#FY21_Resnick) and will be immediately integrated with OpenPBTA data through OpenPedCan. OpenPBTA has paved the way for new modes of collaborative data-driven discovery using open, reproducible, and scalable analyses that will continue to grow over time. We anticipate this foundational work will have an ongoing, long-term impact for pediatric oncology researchers, ultimately accelerating translation and leading to improved outcomes for children with cancer.
All code and processed data are openly available through GitHub, CAVATICA, Zenodo, and PedcBioPortal (see STAR METHODS).