This repository has been archived by the owner on Jun 21, 2023. It is now read-only.

Supplemental Information Titles and Legends

OpenPBTA Project Workflow, Related to Figure 1. Biospecimens and data were collected by CBTN and PNOC. Genomic sequencing and harmonization (orange boxes) were performed by the Kids First Data Resource Center (KFDRC). Analyses in the green boxes were performed by contributors to the OpenPBTA project. Output files are denoted in blue. Figure created with BioRender.com.{#fig:S1 tag="S1" width="7in"}

Validation of Consensus SNV calls and Tumor Mutation Burden, Related to Figures 2 and 3. Correlation (A) and violin (B) plots of mutation variant allele frequencies (VAFs) comparing the variant callers (Lancet, Strelka2, Mutect2, and VarDict) used for PBTA samples. UpSet plot (C) showing overlap of variant calls. Correlation (D) and violin (E) plots of VAFs comparing the variant callers (Lancet, Strelka2, and Mutect2) used for TCGA samples. UpSet plot (F) showing overlap of variant calls. Violin plots (G) showing VAFs for Lancet calls performed on WGS and WXS from the same tumor (N = 52 samples from 13 patients). Cumulative distribution plots of TMB for PBTA (H) and TCGA (I) tumors using consensus SNV calls.{#fig:S2 tag="S2" width="7in"}

Genomic instability of pediatric brain tumors, Related to Figures 2 and 3. (A) Violin plots of tumor purity by cancer group. Dots represent the group median. (B) Oncoprint of canonical somatic gene mutations, CNVs, fusions, and TMB (top bar plot) for the top mutated genes across rare CNS tumors: desmoplastic infantile astrocytoma and ganglioglioma (N = 2), germinoma (N = 4), glial-neuronal NOS (N = 8), metastatic secondary tumors (N = 2), neurocytoma (N = 2), pineoblastoma (N = 4), Rosai-Dorfman disease (N = 2), and sarcomas (N = 4). Patient sex (Germline sex estimate) and tumor histology (Cancer Group) are displayed as annotations at the bottom of each plot. Multiple CNVs are denoted as a complex event. N denotes the number of unique tumors, with one tumor used per patient. (C) Genome-wide plot of CNV alterations by broad histology. Each row represents one sample. Box and whisker plots of the number of CNV breaks (D) or SV breaks (E) by number of chromothripsis regions. Box plots represent the 5% (lower whisker), 25% (lower box), 50% (median), 75% (upper box), and 95% (upper whisker) quantiles.{#fig:S3 tag="S3" width="7in"}

Mutational signatures in pediatric brain tumors, Related to Figure 3. (A) Sample-specific RefSig signature weights across cancer groups ordered by decreasing Signature 1 exposure. (B) Proportion of Signature 1 plotted by phase of therapy for each cancer group.{#fig:S4 tag="S4" width="7in"}

Quality control metrics for TP53 and EXTEND scores, Related to Figure 4. (A) Receiver operating characteristic (ROC) curve for the TP53 classifier run on FPKM values of poly-A RNA-Seq samples. Correlation plots for telomerase scores (EXTEND) with RNA expression of TERT (B) and TERC (C). Red dots in B and C denote samples with known TERT promoter (TERTp) mutations.{#fig:S5 tag="S5" width="7in"}

Subtype-specific clustering and immune cell fractions, Related to Figure 5. First two dimensions from UMAP of sample transcriptome data with points colored by molecular_subtype for medulloblastoma (A), ependymoma (B), low-grade glioma (C), and high-grade glioma (D). (E) Box plots of quanTIseq estimates of immune cell fractions in histologies with more than one molecular subtype with N >= 3. (F) Box plots of the ratio of immune cell fractions of CD8+ to CD4+ T cells in histologies with more than one molecular subtype with N >= 3. Box plots represent the 5% (lower whisker), 25% (lower box), 50% (median), 75% (upper box), and 95% (upper whisker) quantiles.{#fig:S6 tag="S6" width="7in"}

RNA batch and tumor purity assessment, Related to Figures 4 and 5. Bar plot (A) and UMAP (B) of RNA-Seq samples by cancer group and library preparation method. (C) UMAP of RNA-Seq samples by cancer group and sequencing center. For (D-I), RNA-Seq samples were thresholded by median cancer group tumor purity and transcriptomic analyses in Figure {@fig:Fig4}A-D (D-G) and Figure {@fig:Fig5}A,C (H-I) were repeated.{#fig:S7 tag="S7" width="7in"}

Table S1. Related to Figure 1. Table of specimens and associated metadata, clinical data, and histological data utilized in the OpenPBTA project.

Table S2. Related to Figures 2 and 3. Excel file with four sheets, where the first three represent tables of TMB, eight CNS mutational signatures, and chromothripsis events per sample, respectively, and the fourth sheet shows summarized genomic alterations across cancer groups.

Table S3. Related to Figures 4 and 5. Excel file with three sheets representing tables of TP53 scores, telomerase EXTEND scores, and quanTIseq immune scores, respectively.

Table S4. Related to Figures 4 and 5. Excel file with six sheets representing the survival analyses performed for this manuscript. See STAR Methods for details.

Table S5. Related to Figure 1. Excel file with four sheets listing all software and their respective versions used for the OpenPBTA project, including the R packages in the OpenPBTA Docker image, Python packages in the OpenPBTA Docker image, other command line tools in the OpenPBTA Docker image, and all software used in the OpenPBTA workflows, respectively. Note that all software in the OpenPBTA Docker image was utilized within the analysis repository, but not all software was used for the final manuscript.