Releases: nanoporetech/megalodon
Anchor
This release includes a number of new features, optimizations and performance improvements:
- Optimized read processing, enabling reference-anchored (highest quality) modified base calling at the same speed as standalone Guppy.
- Live processing mode with all outputs enabled.
- Support for (and new requirement of) Guppy 4.0+.
- Basecall-anchored modified base calls now output in hts-spec unmapped BAM format (see specification here).
- Optimized modified base database schema for a smaller memory footprint, faster processing, and faster aggregation.
- Fixed bug for sequencing summary file when processing multi-FAST5 reads (Fixes #45).
- Fixed bug where scaling factors were incorrect for signal mapping output in some settings (used for basecaller training).
- Other various optimizations and minor bug fixes.
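To illustrate the kind of per-read information carried by the hts-spec modified base tags mentioned above, here is a minimal sketch of decoding them in Python. It assumes the `MM`/`ML` layout from the SAM optional-fields specification (early drafts spelled the tags `Mm`/`Ml`), forward-strand entries only, and single-letter modification codes. The function name `parse_mm_tag` is hypothetical and is not part of Megalodon.

```python
def parse_mm_tag(seq, mm, ml):
    """Decode an MM-style tag into (read_position, mod_code, ml_value) tuples.

    Assumptions (not taken from the Megalodon source):
      * `seq` is the read sequence in its original orientation,
      * every MM entry refers to the forward strand ("+"),
      * modification codes are single letters such as "m" (5mC) or "h" (5hmC).
    An ml_value N stands for a probability in the interval [N/256, (N+1)/256).
    """
    calls = []
    ml_index = 0
    # Entries are separated by ";", e.g. "C+m,1,1;"
    for entry in filter(None, mm.split(";")):
        fields = entry.split(",")
        header = fields[0]
        deltas = [int(x) for x in fields[1:]]
        base, strand = header[0], header[1]
        if strand != "+":
            raise ValueError("this sketch only handles forward-strand entries")
        # Drop the optional "?" or "." status flag after the codes
        codes = header[2:].rstrip("?.")
        # Read coordinates of every occurrence of the canonical base
        base_positions = [i for i, b in enumerate(seq) if b == base]
        idx = -1
        for delta in deltas:
            # Each delta is the number of unmodified occurrences to skip
            idx += delta + 1
            pos = base_positions[idx]
            # With several codes, ML values are interleaved per position
            for code in codes:
                calls.append((pos, code, ml[ml_index]))
                ml_index += 1
    return calls


# Toy read: the cytosines sit at read positions 1, 2, 4 and 7
example = parse_mm_tag("ACCGCATCG", "C+m,1,1;", [200, 50])
```

In this toy input the first delta skips the cytosine at position 1 and reports the one at position 2, and the second delta skips position 4 and reports position 7. For real BAM files a library such as pysam is the more practical route; this sketch only shows how the tag encoding works.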
Nemo-patch.1
Add support for updated Taiyaki signal mapping interface.
Nemo
This release includes a number of new features, optimizations and performance improvements:
- Default calibration files for all released modified base models, including newly released Rerio "research" models (CpG 5mC for MinION/GridION and PromethION; CpG 5mC and 5hmC model for MinION/GridION). Sequence variant calibration files are also provided for all currently released Flip-flop models (fixes #22).
- New output type, `mod_mappings`, to visualize per-read modified base calls in a genome browser. This output annotates the per-read reference sequence with modified base calls, including confidence scores, and can be visualized in a genome browser using existing bisulfite settings (see example here).
- The modified base processing steps have been optimized to output modified base calls in all contexts more efficiently. Internal tests show that all-context modified base output can now keep up with basecalling using the Guppy backend on 2 V100 GPUs.
- Updates to make model training data preparation easier. Training a new basecalling model can be performed in two command line steps (once software is successfully installed). See documentation for this process here. New methods have been added to prepare training datasets for modified base basecalling models. These are models that specifically detect modified bases along with canonical bases. See documentation for [these new commands here](https://nanoporetech.github.io/megalodon/modbase_training.html).
- Support for analysis of direct RNA sequencing data including the output of RNA training datasets. This does not include the release of any RNA modified base models or default calibration files.
- Megalodon helper scripts are now accessible via the `megalodon_extras` command. These commands have been refactored to be more user-friendly and to be accessible from a standard pip or Conda installation.
- The standard sequencing summary file output by Guppy is now output from Megalodon when the `basecalls` output is selected (fixes #24).
- This release also includes various bug fixes and other optimizations (fixes #28; fixes #33).
Bruce
- Added support for Guppy basecalling backend (via pyguppy).
- Optimized modified base and variant processing.
- PromethION biological context model modified base support.
- Bug fixes, specifically fixed bug in modified base calling in FAST5 input mode.
Fix MANIFEST
Fix MANIFEST contexts in an attempt to create a conda package.
Add MANIFEST
Add manifest so LICENSE can be included in source distribution.
Sherman
- Improved sequence variant (SNP and short indels) performance
- Input Guppy FAST5 basecalls
- Direct Taiyaki training file output
- Experimental RNA support
- Improved statistics aggregation compute performance
- Improved modified base aggregation algorithms
- Various feature additions
- Various bug fixes
Sherman-alpha.3
Restructured sequence variant database schema to allow much faster sequence variant aggregation. Minor (but backwards incompatible) change to modified base schema as well. Note that `--outputs snps` has been converted to `--outputs variants` in this release to emphasize that Megalodon calls SNPs as well as short indels.
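For pipelines that build Megalodon command lines programmatically, the rename can be applied with a small helper along these lines. This is a sketch only: `migrate_outputs` is a hypothetical name, `--other-flag` is a placeholder rather than a real Megalodon option, and the code assumes the values of `--outputs` are passed as separate space-delimited arguments.

```python
def migrate_outputs(argv):
    """Return a copy of argv with `snps` renamed to `variants`.

    Only values that follow `--outputs` are rewritten; any later argument
    starting with "-" is treated as the start of a different option.
    """
    migrated = []
    in_outputs = False
    for arg in argv:
        if arg.startswith("-"):
            # Track whether we are inside the value list of --outputs
            in_outputs = arg == "--outputs"
            migrated.append(arg)
        elif in_outputs and arg == "snps":
            migrated.append("variants")
        else:
            migrated.append(arg)
    return migrated


# Illustrative argument list; --other-flag is a placeholder, not a real option
old_cmd = ["megalodon", "reads/", "--outputs", "basecalls", "snps",
           "--other-flag", "snps"]
new_cmd = migrate_outputs(old_cmd)
```

The occurrence of `snps` after the placeholder flag is left untouched, since only values belonging to `--outputs` are affected by the rename.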
Sherman-alpha.2
More robust handling of nearby variants as well as converting sequence variants to atomic form. Sequence variant detection (SNPs and indels) is greatly improved with this pre-release.
Sherman-alpha.1
Optimize per-read modified base statistics database for faster aggregation and general querying of results.