SparseAssembler

Exploiting sparseness in de novo genome assembly

Basic command for the software:

./SparseAssembler g 10 k 51 LD 0 GS 200000000 NodeCovTh 1 EdgeCovTh 0 f frag_1.fastq f frag_2.fastq f frag_3.fastq &

For memory usage, results and comparisons:

Parameters:

k: k-mer size; supported range 15-127.

g: number of skipped intermediate k-mers; supported range 1-25.

f: single-end reads file in FASTA/FASTQ format. Each input file must be passed with its own f parameter.

GS: estimated genome size in bp (used for memory pre-allocation). A large value is suggested if possible (e.g. ~3x the genome size).

NodeCovTh: coverage threshold for spurious k-mers; supported range 0-16 (default 1).

EdgeCovTh: coverage threshold for spurious links; supported range 0-16 (default 0).

LD: load a saved k-mer graph.

PathCovTh: coverage threshold for spurious paths in the breadth-first search; supported range 0-100.
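
The documented ranges can be checked before a long run is launched. The following is a minimal sketch in Python, not part of SparseAssembler itself: the helper name `build_command` and the `RANGES` table are hypothetical, the limits are copied from the parameter list above, and it assumes the tool accepts its arguments as `name value` pairs in the order shown in the basic command.

```python
# Hypothetical wrapper; SparseAssembler does not ship this helper.
# Ranges are taken from the parameter list in this README.
RANGES = {
    "k": (15, 127),
    "g": (1, 25),
    "NodeCovTh": (0, 16),
    "EdgeCovTh": (0, 16),
    "PathCovTh": (0, 100),
}

# Parameters passed through without a range check, because the README
# does not document limits for them.
UNCHECKED = {"GS", "LD"}


def build_command(params, reads, binary="./SparseAssembler"):
    """Return an argument list such as ['./SparseAssembler', 'g', '10', ...]."""
    if not reads:
        raise ValueError("at least one reads file is required")

    argv = [binary]
    for name, value in params.items():
        if name in RANGES:
            low, high = RANGES[name]
            if not low <= value <= high:
                raise ValueError(f"{name}={value} is outside {low}-{high}")
        elif name not in UNCHECKED:
            raise ValueError(f"unknown parameter: {name}")
        argv += [name, str(value)]

    # Each input file gets its own 'f' parameter.
    for path in reads:
        argv += ["f", path]
    return argv


if __name__ == "__main__":
    cmd = build_command(
        {"g": 10, "k": 51, "LD": 0, "GS": 200000000,
         "NodeCovTh": 1, "EdgeCovTh": 0},
        ["frag_1.fastq", "frag_2.fastq", "frag_3.fastq"],
    )
    print(" ".join(cmd))
```

The returned list can be handed to `subprocess.run(cmd)` to start the assembler. Keeping the arguments as a list rather than a single string avoids shell quoting problems with file names.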
