rsv: update rsv-b only #273

Merged
merged 2 commits into from
Feb 28, 2025
8 changes: 8 additions & 0 deletions data/nextstrain/rsv/b/EPI_ISL_1653999/CHANGELOG.md
@@ -1,3 +1,11 @@
## Unreleased

- designate new lineages B.D.4.1.2 and B.D.4.1.3. The former is an older group that doesn't seem to be circulating at the moment, but is added for consistency
- designate lineages B.D.E.1.2-6: these are several recent groups defined to break up the dominant B.D.E.1 lineage
- designate lineage B.D.E.5: this is a small group with several substitutions in the F-protein.
- update with new sequence data
- remove deprecated G-clades

## 2024-11-27T02:51:00Z

- update reference tree with more recent data.
4,959 changes: 2,772 additions & 2,187 deletions data/nextstrain/rsv/b/EPI_ISL_1653999/sequences.fasta

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion data/nextstrain/rsv/b/EPI_ISL_1653999/tree.json

Large diffs are not rendered by default.

15 changes: 9 additions & 6 deletions data_output/index.json
@@ -2074,10 +2074,7 @@
"treeJson": "tree.json"
},
"capabilities": {
- "clades": 18,
- "customClades": {
- "G_clade": 10
- },
+ "clades": 26,
"qc": [
"privateMutations",
"mixedSites",
@@ -2086,6 +2083,13 @@
]
},
"versions": [
+ {
+ "tag": "unreleased",
+ "compatibility": {
+ "cli": "3.0.0-alpha.0",
+ "web": "3.0.0-alpha.0"
+ }
+ },
{
"updatedAt": "2024-11-27T02:51:00Z",
"tag": "2024-11-27--02-51-00Z",
@@ -2120,8 +2124,7 @@
}
],
"version": {
- "updatedAt": "2024-11-27T02:51:00Z",
- "tag": "2024-11-27--02-51-00Z",
+ "tag": "unreleased",
"compatibility": {
"cli": "3.0.0-alpha.0",
"web": "3.0.0-alpha.0"
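The `versions` list in `index.json` now mixes an `unreleased` entry, which has no `updatedAt`, with dated releases. The following is a minimal Python sketch of how a consumer might pick the newest dated release. It assumes that tags are derived from `updatedAt` in the way the 2024-11-27 entry suggests; the helper names `tag_from_timestamp` and `latest_release` are hypothetical and not part of Nextclade.

```python
from datetime import datetime, timezone

# Trimmed copy of the "versions" entries visible in this diff.
versions = [
    {"tag": "unreleased"},
    {"updatedAt": "2024-11-27T02:51:00Z", "tag": "2024-11-27--02-51-00Z"},
]


def tag_from_timestamp(updated_at):
    """Derive a tag like 2024-11-27--02-51-00Z from an ISO 8601 UTC timestamp.

    The tag format is inferred from the entries in this diff (an assumption).
    """
    parsed = datetime.strptime(updated_at, "%Y-%m-%dT%H:%M:%SZ")
    parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.strftime("%Y-%m-%d--%H-%M-%SZ")


def latest_release(entries):
    """Return the newest entry that carries an updatedAt field, or None."""
    dated = [entry for entry in entries if "updatedAt" in entry]
    # Same-format UTC timestamps sort correctly as plain strings.
    return max(dated, key=lambda entry: entry["updatedAt"], default=None)


if __name__ == "__main__":
    release = latest_release(versions)
    print(release["tag"], tag_from_timestamp(release["updatedAt"]))
```

Skipping entries without `updatedAt` keeps the `unreleased` placeholder from being mistaken for a published version.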
@@ -0,0 +1,27 @@
## Unreleased

- designate new lineages B.D.4.1.2 and B.D.4.1.3. The former is an older group that doesn't seem to be circulating at the moment, but is added for consistency
- designate lineages B.D.E.1.2-6: these are several recent groups defined to break up the dominant B.D.E.1 lineage
- designate lineage B.D.E.5: this is a small group with several substitutions in the F-protein.
- update with new sequence data
- remove deprecated G-clades

## 2024-11-27T02:51:00Z

- update reference tree with more recent data.
- include designation of B.D.E.1.1

## 2024-08-01T22:31:31Z

- update of reference tree with additional data. No new clades.


## 2024-01-29T10:29:43Z

- fix definitions of G_clades (legacy) for RSV-A and RSV-B

## 2024-01-16T20:31:02Z

**first release of v3 dataset.**

Updated consortium nomenclature.
20 changes: 20 additions & 0 deletions data_output/nextstrain/rsv/b/EPI_ISL_1653999/unreleased/README.md
@@ -0,0 +1,20 @@
# RSV-B dataset with reference genome B/Australia/VIC-RCH056/2019

| Key | Value |
| ---------------------- | --------------------------------------------------------------------------------------------------------------------|
| authors | [Richard Neher](https://neherlab.org), Laura Urbanska, [Nextstrain](https://nextstrain.org) |
| data source | Genbank + authorized other sequences |
| workflow | [github.com/nextstrain/rsv](https://github.com/nextstrain/rsv) |
| nextclade dataset path | nextstrain/rsv/b/EPI_ISL_1653999 |
| reference | [OP975389](https://www.ncbi.nlm.nih.gov/nuccore/OP975389.1) |
| clade definitions | [github.com/rsv-lineages/lineage-designation-B](https://github.com/rsv-lineages/lineage-designation-B) |

## Scope of this dataset
This dataset for RSV-B uses the reference sequence B/Australia/VIC-RCH056/2019, which is available as [OP975389](https://www.ncbi.nlm.nih.gov/nuccore/OP975389.1) and is also deposited in GISAID under accession number EPI_ISL_1653999. This sequence has the duplication in the G-protein shared by all currently circulating variants.
The reference tree covers the diversity of RSV-B since the first sequenced samples.


## Nomenclature
The dataset follows the consortium nomenclature established in 2023 that uses a combination of letters and numbers to designate lineages in a hierarchical fashion.
Definitions of individual lineages are available on GitHub in the repository [rsv-lineages/lineage-designation-B](https://github.com/rsv-lineages/lineage-designation-B).
Legacy clade definitions for the nomenclature defined by Goya et al. (`G_clade`) are included for orientation. These clade definitions will not be updated and are incomplete. We encourage users to use the new consortium nomenclature.
Binary file not shown.
@@ -0,0 +1,14 @@
##gff-version 3
##sequence-region EPI_ISL_1653999 1 15222
EPI_ISL_1653999 feature source 1 15222 . . . mol_type=viral cRNA;organism=Human respiratory syncytial virus B
EPI_ISL_1653999 feature CDS 57 476 . . 0 codon_start=1;locus_tag=NS1;Name=NS1;product=nonstructural protein 1;protein_id=QPB74302.1
EPI_ISL_1653999 feature CDS 584 958 . . 0 codon_start=1;locus_tag=NS2;Name=NS2;product=nonstructural protein 2;protein_id=QPB74303.1
EPI_ISL_1653999 feature CDS 1097 2272 . . 0 codon_start=1;locus_tag=N;Name=N;product=nucleocapsid protein;protein_id=QPB74304.1
EPI_ISL_1653999 feature CDS 2305 3030 . . 0 codon_start=1;locus_tag=P;Name=P;product=phosphoprotein;protein_id=QPB74305.1
EPI_ISL_1653999 feature CDS 3220 3990 . . 0 codon_start=1;locus_tag=M;Name=M;product=matrix protein;protein_id=QPB74306.1
EPI_ISL_1653999 feature CDS 4259 4456 . . 0 codon_start=1;locus_tag=SH;Name=SH;product=small hydrophobic protein;protein_id=QPB74307.1
EPI_ISL_1653999 feature CDS 4646 5578 . . 0 codon_start=1;locus_tag=G;Name=G;product=attachment glycoprotein;protein_id=QPB74308.1
EPI_ISL_1653999 feature CDS 5676 7400 . . 0 codon_start=1;locus_tag=F;Name=F;product=fusion glycoprotein;protein_id=QPB74309.1
EPI_ISL_1653999 feature CDS 7627 8214 . . 0 codon_start=1;locus_tag=M2-1;Name=M2-1;product=transcription elongation factor M2-1;protein_id=QPB74310.1
EPI_ISL_1653999 feature CDS 8180 8452 . . 0 codon_start=1;locus_tag=M2-2;Name=M2-2;product=transcription elongation factor M2-2;protein_id=QPB74311.1
EPI_ISL_1653999 feature CDS 8518 15018 . . 0 codon_start=1;locus_tag=L;Name=L;product=RNA-directed RNA polymerase L;protein_id=QPB74312.1
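As a quick sanity check on the annotation above, the CDS coordinates can be parsed and each length tested for being a whole number of codons. This is a minimal Python sketch, not the Nextclade parser. It assumes the file is tab-separated as GFF3 requires (the rendering above shows spaces), and the helper names `parse_cds` and `cds_length` are hypothetical.

```python
# Three CDS rows from the annotation above, rebuilt with tab separators.
# GFF3 columns: seqid, source, type, start, end, score, strand, phase, attributes.
GFF3_TEXT = "\n".join([
    "##gff-version 3",
    "\t".join(["EPI_ISL_1653999", "feature", "CDS", "5676", "7400", ".", ".", "0",
               "codon_start=1;locus_tag=F;Name=F;product=fusion glycoprotein"]),
    "\t".join(["EPI_ISL_1653999", "feature", "CDS", "7627", "8214", ".", ".", "0",
               "codon_start=1;locus_tag=M2-1;Name=M2-1"]),
    "\t".join(["EPI_ISL_1653999", "feature", "CDS", "8180", "8452", ".", ".", "0",
               "codon_start=1;locus_tag=M2-2;Name=M2-2"]),
])


def parse_cds(lines):
    """Map each CDS name to its (start, end) coordinates."""
    cds = {}
    for line in lines:
        if not line or line.startswith("#"):
            continue  # skip blank lines and directives such as ##gff-version
        cols = line.split("\t")
        if len(cols) != 9 or cols[2] != "CDS":
            continue
        attrs = dict(item.split("=", 1) for item in cols[8].split(";") if "=" in item)
        cds[attrs["Name"]] = (int(cols[3]), int(cols[4]))
    return cds


def cds_length(start, end):
    """Length in nucleotides; GFF3 coordinates are 1-based and inclusive."""
    return end - start + 1


if __name__ == "__main__":
    for name, (start, end) in parse_cds(GFF3_TEXT.splitlines()).items():
        length = cds_length(start, end)
        print(name, length, "whole codons" if length % 3 == 0 else "partial codon")
```

The same check can be run over all eleven CDS rows by reading the real `genome_annotation.gff3` instead of the embedded sample.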
109 changes: 109 additions & 0 deletions data_output/nextstrain/rsv/b/EPI_ISL_1653999/unreleased/pathogen.json
@@ -0,0 +1,109 @@
{
  "schemaVersion": "3.0.0",
  "alignmentParams": {
    "excessBandwidth": 9,
    "terminalBandwidth": 100,
    "allowedMismatches": 4,
    "gapAlignmentSide": "left",
    "minSeedCover": 0.1
  },
  "compatibility": {
    "cli": "3.0.0-alpha.0",
    "web": "3.0.0-alpha.0"
  },
  "defaultCds": "F",
  "files": {
    "changelog": "CHANGELOG.md",
    "examples": "sequences.fasta",
    "genomeAnnotation": "genome_annotation.gff3",
    "pathogenJson": "pathogen.json",
    "readme": "README.md",
    "reference": "reference.fasta",
    "treeJson": "tree.json"
  },
  "qc": {
    "privateMutations": {
      "enabled": true,
      "typical": 50,
      "cutoff": 150,
      "weightLabeledSubstitutions": 2,
      "weightReversionSubstitutions": 1,
      "weightUnlabeledSubstitutions": 1
    },
    "missingData": {
      "enabled": false,
      "missingDataThreshold": 2000,
      "scoreBias": 500
    },
    "snpClusters": {
      "enabled": false,
      "windowSize": 100,
      "clusterCutOff": 10,
      "scoreWeight": 50
    },
    "mixedSites": {
      "enabled": true,
      "mixedSitesThreshold": 8
    },
    "frameShifts": {
      "enabled": true
    },
    "stopCodons": {
      "enabled": true,
      "ignoredStopCodons": [
        {
          "codon": 320,
          "cdsName": "G"
        }
      ]
    }
  },
  "cdsOrderPreference": [
    "F",
    "G",
    "L"
  ],
  "maintenance": {
    "website": [
      "https://nextstrain.org",
      "https://clades.nextstrain.org"
    ],
    "documentation": [
      "https://github.com/nextstrain/rsv"
    ],
    "source code": [
      "https://github.com/nextstrain/rsv"
    ],
    "issues": [
      "https://github.com/nextstrain/rsv/issues"
    ],
    "organizations": [
      "Nextstrain"
    ],
    "authors": [
      "Nextstrain team <https://nextstrain.org>"
    ]
  },
  "shortcuts": [
    "rsv_b",
    "nextstrain/rsv/b",
    "nextstrain/rsv/b/hRSV-B-Australia-VIC-RCH056-2019"
  ],
  "attributes": {
    "name": "RSV-B",
    "reference accession": "EPI_ISL_1653999",
    "reference name": "hRSV/B/Australia/VIC-RCH056/2019"
  },
  "geneOrderPreference": [
    "F",
    "G",
    "L"
  ],
  "version": {
    "tag": "unreleased",
    "compatibility": {
      "cli": "3.0.0-alpha.0",
      "web": "3.0.0-alpha.0"
    }
  }
}
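The `qc` block in `pathogen.json` enables some rules and disables others. A minimal Python sketch of listing the enabled rules follows; it reads a trimmed copy of the block rather than the real file, and the helper name `enabled_qc_rules` is hypothetical. The result can be compared against the `qc` list under `capabilities` in `index.json`.

```python
import json

# Trimmed copy of the "qc" block from pathogen.json above (thresholds shortened).
PATHOGEN_JSON = """
{
  "qc": {
    "privateMutations": {"enabled": true, "typical": 50, "cutoff": 150},
    "missingData": {"enabled": false, "missingDataThreshold": 2000},
    "snpClusters": {"enabled": false, "windowSize": 100},
    "mixedSites": {"enabled": true, "mixedSitesThreshold": 8},
    "frameShifts": {"enabled": true},
    "stopCodons": {"enabled": true}
  }
}
"""


def enabled_qc_rules(pathogen):
    """Return the names of QC rules whose "enabled" flag is true, in file order."""
    return [
        name
        for name, rule in pathogen.get("qc", {}).items()
        if rule.get("enabled")
    ]


if __name__ == "__main__":
    print(enabled_qc_rules(json.loads(PATHOGEN_JSON)))
```

To run it against the actual dataset, replace `json.loads(PATHOGEN_JSON)` with `json.load(open(path))` for a local copy of `pathogen.json`.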