<!DOCTYPE html>
<html>
<head>
<title>assign-confidence</title>
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<link rel="stylesheet" type="text/css" href="../styles.css">
<script type="text/javascript"
src="http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
</script>
<script type="text/javascript">
MathJax.Hub.Config({jax: ['input/TeX','output/HTML-CSS'], displayAlign: 'left'});
</script>
<script type="text/javascript">
var _gaq = _gaq || [];
_gaq.push(['_setAccount', 'UA-26136956-1']);
_gaq.push(['_trackPageview']);
(function() {
var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
})();
</script>
<script src="http://ajax.googleapis.com/ajax/libs/jquery/1.8.3/jquery.min.js"></script>
<script type="text/javascript">
// Main Menu
$( document ).ready(function() {
var pull = $('.btn');
menu = $('nav ul');
menuHeight = menu.height();
$(pull).on('click', function(e) {
e.preventDefault();
menu.slideToggle();
});
$(window).resize(function(){
var w = $(window).width();
if(w > 320 && menu.is(':hidden')) {
menu.removeAttr('style');
}
});
});
</script>
</head>
<body>
<div class="page-wrap">
<nav>
<div class="btn">
</div>
<img src="../images/crux-logo.png" id="logo">
<ul id="navitems">
<li><a href="../index.html">Home</a></li>
<li><a href="../download.html">Download</a></li>
<li><a href="../fileformats.html">File Formats</a></li>
<li><a href="http://groups.google.com/group/crux-users">Contact</a></li> <!--Link to google support board-->
</ul>
</nav>
<div id="content" class="autogenerated">
<!-- START CONTENT -->
<h1>assign-confidence</h1>
<h2>Usage:</h2>
<p><code>crux assign-confidence [options] &lt;target input&gt;+</code></p>
<h2>Description:</h2>
<p>Given target and decoy scores, estimate a q-value for each target score. The q-value is analogous to a p-value but incorporates false discovery rate multiple testing correction. The q-value associated with a score threshold T is defined as the minimal false discovery rate (FDR) at which a score of T is deemed significant. In this setting, the q-value accounts for the fact that we are analyzing a large collection of scores. For confidence estimation aficionados, please note that this definition of "q-value" is independent of the notion of "positive FDR" as defined in (Storey, <em>Annals of Statistics</em> 31:2013-2035, 2003).</p>
<p>To estimate FDRs, <code>assign-confidence</code> uses one of two procedures. Both require that the input contain both target and decoy scores. The default procedure, target-decoy competition (TDC), is described in this article:</p>
<blockquote>Josh E. Elias and Steve P. Gygi. "Target-decoy search strategy for increased confidence in large-scale protein identifications by mass spectrometry." <em>Nature Methods</em>. 4(3):207-14, 2007.</blockquote>
<p>Note that <code>assign-confidence</code> implements a variant of the protocol proposed by Elias and Gygi: rather than reporting a list that contains both targets and decoys, <code>assign-confidence</code> reports only the targets. The FDR estimate is adjusted accordingly (by dividing by 2).</p>
<p>The alternative procedure, <em>mix-max</em>, is described in this article:</p>
<blockquote>Uri Keich, Attila Kertesz-Farkas and William Stafford Noble. <a href="http://pubs.acs.org/doi/abs/10.1021/acs.jproteome.5b00081">"An improved false discovery rate estimation procedure for shotgun proteomics."</a> <i>Journal of Proteome Research</i>. 14(8):3148-3161, 2015.</blockquote>
<p>Note that the mix-max procedure requires calibrated scores as input, such as Comet E-values or p-values produced using Tide's <code>exact-p-value</code> option.</p>
<p>The mix-max procedure requires that scores be reported from separate target and decoy searches. Thus, this approach is incompatible with a search that is run using the <code>--concat T</code> option to <code>tide-search</code> or the <code>--decoy_search 2</code> option to <code>comet</code>. On the other hand, the TDC procedure can take as input searches conducted in either mode (concatenated or separate). If given separate search results and asked to do TDC estimation, <code>assign-confidence</code> will carry out the target-decoy competition as part of the confidence estimation procedure.</p>
<p>In each case, the estimated FDRs are converted to q-values by sorting the scores and then taking, for each score, the minimum of the current FDR and all of the FDRs below it in the ranked list.</p>
<p>A primer on multiple testing correction can be found here:</p>
<blockquote>William Stafford Noble. <a href="http://www.nature.com/nbt/journal/v27/n12/full/nbt1209-1135.html">"How does multiple testing correction work?"</a> <em>Nature Biotechnology</em>. 27(12):1135-1137, 2009.</blockquote>
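<p>The TDC estimate and the FDR-to-q-value conversion described above can be sketched in Python as follows. This is a minimal illustration, not the Crux implementation: the function name <code>tdc_q_values</code> is hypothetical, higher scores are assumed to be better, and the FDR at each target is assumed to be the number of decoys ranked at or above it divided by the number of targets ranked at or above it.</p>

```python
def tdc_q_values(target_scores, decoy_scores):
    """Return (score, q-value) pairs for the targets, best score first.

    Assumptions (not taken from the Crux source): higher scores are
    better, and FDR = decoys / targets at each position in the ranking.
    """
    # Rank all scores from best to worst. On ties, decoys sort first
    # (False < True), which makes the estimate slightly conservative.
    labeled = [(s, True) for s in target_scores]
    labeled += [(s, False) for s in decoy_scores]
    labeled.sort(key=lambda item: (-item[0], item[1]))

    # Walk down the ranked list, recording an FDR estimate at each target.
    fdrs = []
    n_targets = 0
    n_decoys = 0
    for _score, is_target in labeled:
        if is_target:
            n_targets += 1
            fdrs.append(min(1.0, n_decoys / n_targets))
        else:
            n_decoys += 1

    # q-value = minimum of the current FDR and all FDRs below it,
    # computed as a running minimum from the bottom of the list upward.
    q_values = []
    running_min = 1.0
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        q_values.append(running_min)
    q_values.reverse()

    ranked_targets = sorted(target_scores, reverse=True)
    return list(zip(ranked_targets, q_values))


pairs = tdc_q_values([10, 9, 8, 7, 6], [8.5, 6.5, 5])
print(pairs)
```

<p>For the example scores, the raw FDR estimates down the target list are 0, 0, 1/3, 0.25 and 0.4, and the running minimum turns them into the q-values 0, 0, 0.25, 0.25 and 0.4.</p>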
<h2>Input:</h2>
<ul>
<li><code>target input+</code> – One or more files, each containing a collection of peptide-spectrum matches (PSMs) in <a href="../file-formats/txt-format.html">tab-delimited text</a>, <a href="http://tools.proteomecenter.org/wiki/index.php?title=Formats:pepXML">PepXML</a>, or <a href="http://www.psidev.info/mzidentml">mzIdentML</a> format. In tab-delimited text format, only the specified score column is required. However, if <code>--estimation-method</code> is <code>tdc</code>, then the columns "scan" and "charge" are required, as well as "protein ID" if the search was run with concat=F. Furthermore, if <code>--estimation-method</code> is set to <code>peptide-level</code>, then the column "peptide" must be included, and if <code>--sidak</code> is set to T, then the "distinct matches/spectrum" column must be included.<br>Note that multiple files can be provided either on the command line or using the <code>--list-of-files</code> option.<br>Decoys can be provided in two ways: either as a separate file or embedded within the same file as the targets. Crux will first search the given file for decoys using a prefix (specified via <code>--decoy-prefix</code>) on the protein name. If no decoys are found, then Crux will search for decoys in a separate file. The decoy file name is constructed from the target file name by replacing "target" with "decoy". For example, if tide-search.target.txt is provided as input, then Crux will search for a corresponding file named "tide-search.decoy.txt".<br>Note that if decoys are provided in a separate file, then assign-confidence will first carry out a target-decoy competition, identifying corresponding pairs of targets and decoys and eliminating the one with the worse score. In this case, the column/tag called "delta_cn" will be eliminated from the output.</li>
</ul>
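<p>The decoy file lookup and the target-decoy competition described above can be illustrated with a short Python sketch. The helper names <code>decoy_path_for</code> and <code>compete</code> are hypothetical, as are the assumptions that "target" is replaced only in the file name (not in directory names), that targets and decoys are paired on the "scan" and "charge" columns, that higher scores are better, and that a target wins a tie. Crux itself may resolve these details differently.</p>

```python
from pathlib import PurePosixPath


def decoy_path_for(target_path):
    """Derive the decoy file name by replacing "target" with "decoy".

    Assumption: only the file name is rewritten, not the directories.
    """
    path = PurePosixPath(target_path)
    return str(path.with_name(path.name.replace("target", "decoy")))


def compete(targets, decoys):
    """Keep the better-scoring PSM for each (scan, charge) pair.

    Each PSM is a dict with "scan", "charge" and "score" keys.
    Returns a list of (label, psm) tuples, where label is "target"
    or "decoy". Assumption: higher scores are better; ties go to
    the target because targets are visited first.
    """
    best = {}
    for label, psms in (("target", targets), ("decoy", decoys)):
        for psm in psms:
            key = (psm["scan"], psm["charge"])
            current = best.get(key)
            if current is None or psm["score"] > current[1]["score"]:
                best[key] = (label, psm)
    return list(best.values())


targets = [
    {"scan": 1, "charge": 2, "score": 3.0},
    {"scan": 2, "charge": 2, "score": 1.0},
]
decoys = [
    {"scan": 1, "charge": 2, "score": 2.0},
    {"scan": 2, "charge": 2, "score": 1.5},
]
winners = compete(targets, decoys)
print(decoy_path_for("crux-output/tide-search.target.txt"))
print([label for label, _psm in winners])
```

<p>In this toy input the target wins for scan 1 and the decoy wins for scan 2, so one PSM of each kind survives the competition.</p>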
<h2>Output:</h2>
<p>The program writes files to the folder <code>crux-output</code> by default. The name of the output folder can be set by the user using the <code>--output-dir</code> option. The following files will be created:
<ul>
<li><code>assign-confidence.target.txt</code> – a <a href="../file-formats/txt-format.html">tab-delimited text file</a> that contains the targets, sorted by score. The file will contain one new column, named "&lt;method&gt; q-value", where &lt;method&gt; is either "tdc" or "mix-max".</li>
<li><code>assign-confidence.log.txt</code> – a log file containing a copy of all messages that were printed to stderr.</li>
<li><code>assign-confidence.params.txt</code> – a file containing the name and value of all parameters/options for the current operation. Not all parameters in the file may have been used in the operation. The resulting file can be used with the <code>--parameter-file</code> option for other Crux programs.</li>
</ul>
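<p>As an illustration of how the target output might be consumed downstream, the following Python sketch keeps the rows whose q-value falls at or below a threshold. The function name <code>confident_targets</code>, the 1% threshold, and the sample rows are assumptions made for the example. Only the "&lt;method&gt; q-value" column name and the tab-delimited layout come from the description above.</p>

```python
import csv
import io


def confident_targets(handle, method="tdc", threshold=0.01):
    """Return the rows whose "<method> q-value" is at or below threshold.

    `handle` is any open text file in tab-delimited format with a
    header row, e.g. open("crux-output/assign-confidence.target.txt").
    """
    column = f"{method} q-value"
    reader = csv.DictReader(handle, delimiter="\t")
    return [row for row in reader if float(row[column]) <= threshold]


# Made-up sample standing in for assign-confidence.target.txt.
sample = io.StringIO(
    "scan\tcharge\txcorr score\ttdc q-value\n"
    "101\t2\t3.9\t0.002\n"
    "102\t3\t1.1\t0.35\n"
)
kept = confident_targets(sample)
print([row["scan"] for row in kept])
```

<p>Passing <code>method="mix-max"</code> selects the "mix-max q-value" column instead.</p>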
<h2>Options:</h2>
<ul style="list-style-type: none;">
<li class="nobullet">
<h3>assign-confidence options</h3>
<ul>
<li class="nobullet"><code>--estimation-method mix-max|tdc|peptide-level</code> – Specify the method used to estimate q-values. The mix-max and target-decoy competition procedures apply to PSMs. The peptide-level option eliminates any PSM for which there exists a better scoring PSM involving the same peptide, and then uses decoys to assign confidence estimates. Default = <code>tdc</code>.</li>
<li class="nobullet"><code>--score &lt;string&gt;</code> – Specify the column (for tab-delimited input) or tag (for XML input) used as input to the q-value estimation procedure. If this parameter is unspecified, then the program searches the input file for the score fields "xcorr score", "evalue" (Comet), and "exact p-value", in that order. Default = <code>&lt;empty&gt;</code>.</li>
<li class="nobullet"><code>--sidak T|F</code> – Adjust the scores using the Sidak adjustment and report them in a new column in the output file. Note that this adjustment only makes sense if the given scores are p-values, and that it requires the presence of the "distinct matches/spectrum" feature for each PSM. Default = <code>false</code>.</li>
<li class="nobullet"><code>--combine-charge-states T|F</code> – Set this parameter to T to combine charge states with peptide sequences in peptide-centric search. Works only if <code>--estimation-method</code> is <code>peptide-level</code>. Default = <code>false</code>.</li>
<li class="nobullet"><code>--combine-modified-peptides T|F</code> – Set this parameter to T to treat peptides carrying different or no modifications as being the same. Works only if <code>--estimation-method</code> is <code>peptide-level</code>. Default = <code>false</code>.</li>
</ul>
</li>
<li class="nobullet">
<h3>Input and output</h3>
<ul>
<li class="nobullet"><code>--decoy-prefix &lt;string&gt;</code> – Specifies the prefix of the protein names that indicates a decoy. Default = <code>decoy_</code>.</li>
<li class="nobullet"><code>--verbosity &lt;integer&gt;</code> – Specify the verbosity of the current processes. Each level prints the following messages, including all those at lower verbosity levels: 0-fatal errors, 10-non-fatal errors, 20-warnings, 30-information on the progress of execution, 40-more progress information, 50-debug info, 60-detailed debug info. Default = <code>30</code>.</li>
<li class="nobullet"><code>--parameter-file &lt;string&gt;</code> – A file containing parameters. See the <a href="../file-formats/parameter-file.html">parameter documentation</a> page for details. Default = <code>&lt;empty&gt;</code>.</li>
<li class="nobullet"><code>--overwrite T|F</code> – Replace existing files if true or fail when trying to overwrite a file if false. Default = <code>false</code>.</li>
<li class="nobullet"><code>--output-dir &lt;string&gt;</code> – The name of the directory where output files will be created. Default = <code>crux-output</code>.</li>
<li class="nobullet"><code>--list-of-files T|F</code> – Specify that the search results are provided as lists of files, rather than as individual files. Default = <code>false</code>.</li>
<li class="nobullet"><code>--fileroot &lt;string&gt;</code> – The fileroot string will be added as a prefix to all output file names. Default = <code>&lt;empty&gt;</code>.</li>
</ul>
</li>
</ul>
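<p>To make the <code>--sidak</code> option more concrete, the sketch below applies the textbook Šidák formula, 1 - (1 - p)<sup>n</sup>, to a single p-value, taking n from the "distinct matches/spectrum" column. The function name <code>sidak_adjust</code> is hypothetical, and whether Crux computes the adjustment in exactly this form is an assumption.</p>

```python
def sidak_adjust(p_value, n_matches):
    """Textbook Sidak adjustment of one p-value for n_matches tests.

    Assumption: n_matches corresponds to the "distinct matches/spectrum"
    value of the PSM, i.e. the number of candidate peptides scored
    against the spectrum.
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie in [0, 1]")
    if n_matches < 1:
        raise ValueError("n_matches must be at least 1")
    # Probability that at least one of n independent null tests
    # produces a p-value this small or smaller.
    return 1.0 - (1.0 - p_value) ** n_matches


print(sidak_adjust(0.5, 2))
print(sidak_adjust(1e-4, 1000))
```

<p>With a single candidate the p-value is unchanged, and the adjusted value grows toward 1 as the number of candidates increases. This is why the adjustment only makes sense when the input scores are p-values.</p>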
<!-- END CONTENT -->
</div>
</div>
<footer class="site-footer">
<div id="centerfooter">
<div class="footerimportantlinks">
<img src="../images/linkicon.png" style="width:16px; height:16px"><h3>Important links</h3>
<ul>
<li><a href="../faq.html">Crux <strong>FAQ</strong></a></li>
<li><a href="../glossary.html">Glossary of terminology</a></li>
<li><a href="http://scholar.google.com/citations?hl=en&user=Rw9S1HIAAAAJ">Google Scholar profile</a></li>
<li><a href="https://sourceforge.net/projects/cruxtoolkit/">SourceForge issues list</a></li>
<li><a href="../release-notes.html">Release Notes</a></li>
<li><a href="https://mailman1.u.washington.edu/mailman/listinfo/crux-users" title="Receive announcements of new versions">Join the mailing list</a></li>
<li><a href="http://www.apache.org/licenses/LICENSE-2.0">Apache license</a></li>
<li><a href="http://groups.google.com/group/crux-users">Support Board</a></li>
</ul>
</div>
<div class="footerimportantlinks tutoriallinks">
<img src="../images/tutorialicon.png" style="height:16px"><h3>Tutorials</h3>
<ul>
<li><a href="../tutorials/install.html">Installation</a></li>
<li><a href="../tutorials/gettingstarted.html">Getting started with Crux</a></li>
<li><a href="../tutorials/search.html">Running a simple search using Tide and Percolator</a></li>
<li><a href="../tutorials/customizedsearch.html">Customization and search options</a></li>
<li><a href="../tutorials/spectralcounts.html">Using spectral-counts</a></li>
</ul>
</div>
<div id="footertext">
<p>
The original version of Crux was written by Chris Park and Aaron Klammer
under the supervision
of <a href="http://www.gs.washington.edu/faculty/maccoss.htm">Prof. Michael
MacCoss</a>
and <a href="http://noble.gs.washington.edu/~noble">Prof. William
Stafford Noble</a> in the Department of Genome Sciences at the
University of Washington, Seattle. Website by <a href="http://www.yuvalboss.com/">Yuval Boss</a>.
<br />The complete list of contributors
can be found <a href="../contributors.html">here</a>.
<br />
<br />
Maintenance and development of Crux are funded by the <a href="https://www.nih.gov/">National Institutes of Health</a> awards R01 GM096306 and P41 GM103533.
</p>
</div>
</div>
</footer>
</body>
</html>