Project Descriptions

Sam Payne edited this page Mar 16, 2017 · 4 revisions

Description

Informed-Proteomics is an open-source suite written in C# for top-down proteomics analysis. The package consists of several modules: data structures and algorithms for analyzing proteomics data, unit tests, and command-line tools for running the LC-MS feature-finding algorithm (ProMex) and a new database search algorithm (MSPathFinder). Source code and compiled binaries are available on GitHub at https://github.com/PNNL-Comp-Mass-Spec/Informed-Proteomics. An interactive results viewer, LcMsSpectator, is also available at https://github.com/PNNL-Comp-Mass-Spec/LCMS-Spectator.

Library Modules

InformedProteomics.Backend

Base elements for processing proteomics data

  • Read/write mass-spectrometry data in different formats
  • Extract spectra and chromatograms
  • Handle protein sequence databases in FASTA format
  • Predict theoretical isotopomer envelopes
  • Data structures for atom, composition, amino acid, sequence, and modification
  • Sequence graph for modification search
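
To make the composition and isotopomer-envelope items above more concrete, here is a minimal Python sketch. It is not the InformedProteomics.Backend API (the library is written in C#). The function names, the dictionary-based composition, and the simple per-atom convolution are all assumptions chosen for illustration. The isotope abundances are standard natural-abundance values rounded to a few digits.

```python
# Illustrative sketch only; names and structure are hypothetical,
# not taken from InformedProteomics.Backend.

# Monoisotopic masses (Da) of the elements common in proteins.
MONO_MASS = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "S": 31.97207100,
}

# Natural isotope abundances, indexed by the number of extra neutrons
# relative to the lightest isotope (index 0 = monoisotopic).
ISOTOPES = {
    "C": [0.9893, 0.0107],
    "H": [0.999885, 0.000115],
    "N": [0.99636, 0.00364],
    "O": [0.99757, 0.00038, 0.00205],
    "S": [0.9499, 0.0075, 0.0425, 0.0, 0.0001],
}


def monoisotopic_mass(composition):
    """Sum of monoisotopic masses for a composition such as {"C": 2, "H": 5}."""
    return sum(MONO_MASS[element] * count for element, count in composition.items())


def convolve(a, b, max_len):
    """Discrete convolution of two abundance lists, truncated to max_len entries."""
    out = [0.0] * min(len(a) + len(b) - 1, max_len)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            if i + j < len(out):
                out[i + j] += x * y
    return out


def isotope_envelope(composition, max_peaks=6):
    """Relative abundance of the M+0, M+1, ... isotopomer peaks.

    Built by convolving the isotope distribution of every atom, one atom
    at a time. This is simple rather than fast.
    """
    envelope = [1.0]
    for element, count in composition.items():
        for _ in range(count):
            envelope = convolve(envelope, ISOTOPES[element], max_peaks)
    return envelope


glycine = {"C": 2, "H": 5, "N": 1, "O": 2}
print(round(monoisotopic_mass(glycine), 4))
print([round(p, 4) for p in isotope_envelope(glycine)])
```

For intact proteins with thousands of atoms, a practical implementation would use a faster scheme (for example, repeated squaring of per-element distributions or an averagine approximation); the per-atom loop here only shows the principle.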

InformedProteomics.FeatureFinding

LC-MS feature finding algorithm (ProMex)

  • Data structures for LC-MS features
  • Likelihood ratio scoring model
  • Align LC-MS features across multiple runs

InformedProteomics.TopDown

Top-down proteomics specific data structures and algorithms

  • Scoring models for proteoform-spectrum matches
  • Sequence tag finding algorithm
  • Sequence tag-based search algorithm for multiply cleaved proteoforms

InformedProteomics.Scoring

Generating function approach for computing statistical significance of proteoform- or peptide-spectrum matches
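
The idea behind a generating function approach can be sketched with a toy dynamic program: for a given precursor mass, compute the probability that a random sequence reaches each possible score, then sum the tail at or above the observed score. The Python example below is an illustration under strong assumptions (integer residue masses, a hypothetical per-prefix-mass score table, equally likely residues). It is not the scoring model implemented in InformedProteomics.Scoring.

```python
from collections import defaultdict

# Toy illustration; all names and the scoring scheme are hypothetical.


def score_distribution(residue_masses, node_score, target_mass):
    """Return {score: probability} over random sequences summing to target_mass.

    residue_masses: residue -> positive integer mass
    node_score:     prefix mass -> integer score earned when a sequence
                    has a prefix of that mass (missing masses score 0)
    """
    p = 1.0 / len(residue_masses)  # assume residues are equally likely
    # dist[m][s] = probability of a prefix with total mass m and score s
    dist = [defaultdict(float) for _ in range(target_mass + 1)]
    dist[0][0] = 1.0
    for m in range(1, target_mass + 1):
        gain = node_score.get(m, 0)
        for mass in residue_masses.values():
            prev = m - mass
            if prev < 0:
                continue
            # Extend every prefix of mass `prev` by one residue.
            for s, prob in dist[prev].items():
                dist[m][s + gain] += prob * p
    return dict(dist[target_mass])


def tail_probability(distribution, observed_score):
    """Probability that a random sequence scores at least observed_score."""
    return sum(prob for s, prob in distribution.items() if s >= observed_score)


# Two made-up residues with masses 1 and 2, and scores at prefix masses 1 and 2.
toy = score_distribution({"A": 1, "B": 2}, {1: 1, 2: 2}, target_mass=3)
print(toy)
print(tail_probability(toy, 2))  # 0.375
```

In this toy case the sequences of total mass 3 are AAA (score 3), AB (score 1), and BA (score 2), so the probability of scoring at least 2 is 0.125 + 0.25 = 0.375.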

SAIS

External library for induced-sorting-based suffix array construction algorithms

  • Ge Nong, Sen Zhang & Wai Hong Chan. Two Efficient Algorithms for Linear Time Suffix Array Construction. IEEE Trans. Comput. 60, 1471–1484 (2011)
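
For readers unfamiliar with suffix arrays, the Python sketch below shows what the structure is and how it supports substring lookup in a sequence database. It deliberately uses naive sorting of suffixes, not the linear-time induced-sorting algorithm of the cited paper, and the function names are hypothetical.

```python
# Naive illustration of a suffix array; NOT the SA-IS algorithm.


def build_suffix_array(text):
    """Start positions of all suffixes of text, in lexicographic order.

    Sorting whole suffixes is far slower than induced sorting on large
    inputs, but it yields the same array.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, suffix_array, pattern):
    """All start positions of pattern in text, via two binary searches."""
    k = len(pattern)

    def prefix(rank):
        start = suffix_array[rank]
        return text[start:start + k]

    # Lower bound: first suffix whose k-length prefix is >= pattern.
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) < pattern:
            lo = mid + 1
        else:
            hi = mid
    first = lo

    # Upper bound: first suffix whose k-length prefix is > pattern.
    hi = len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) <= pattern:
            lo = mid + 1
        else:
            hi = mid

    return sorted(suffix_array[first:lo])


sequence = "MKVLAAGKVL"
sa = build_suffix_array(sequence)
print(find_occurrences(sequence, sa, "KVL"))  # [1, 7]
```

Once the array is built, each lookup costs a logarithmic number of string comparisons, which is why a fast construction algorithm matters for large protein databases.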

Command Line Tools

MSPathFinderT

Command-line tool for the top-down proteomics database search algorithm

  • Executable MSPathFinder for top-down proteomics data

PbfGen

Command-line tool for generating PBF files from LC-MS data in Thermo RAW or mzML format

ProMex

Command line tool for LC-MS feature finding