Empirical statistical model to estimate the accuracy of peptide identifications made by MS/MS and database search

A Keller, AI Nesvizhskii, E Kolker… - Analytical …, 2002 - ACS Publications
Analytical Chemistry, 2002. ACS Publications
We present a statistical model to estimate the accuracy of peptide assignments to tandem mass (MS/MS) spectra made by database search applications such as SEQUEST. Employing the expectation maximization algorithm, the analysis learns to distinguish correct from incorrect database search results, computing probabilities that peptide assignments to spectra are correct based upon database search scores and the number of tryptic termini of peptides. Using SEQUEST search results for spectra generated from a sample of known protein components, we demonstrate that the computed probabilities are accurate and have high power to discriminate between correctly and incorrectly assigned peptides. This analysis makes it possible to filter large volumes of MS/MS database search results with predictable false identification error rates and can serve as a common standard by which the results of different research groups are compared.
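The expectation maximization step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a single one-dimensional search score per spectrum and models both the correct and incorrect populations as Gaussians, purely for simplicity. It also omits the number of tryptic termini that the abstract names as a second source of evidence. All function names (`fit_em`, `expected_error_rate`, `simulate_scores`) are hypothetical, and the scores are synthetic rather than taken from any SEQUEST output.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def weighted_mean_sd(values, weights):
    """Weighted mean and standard deviation, floored to avoid collapse."""
    total = max(sum(weights), 1e-12)
    mean = sum(w * v for v, w in zip(values, weights)) / total
    var = sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / total
    return mean, max(math.sqrt(var), 1e-6)


def fit_em(scores, n_iter=200):
    """Fit a two-Gaussian mixture by EM.

    Returns (probs, prior_cor): P(correct | score) for each score, and the
    estimated fraction of correct assignments.
    """
    lo, hi = min(scores), max(scores)
    span = hi - lo
    # Assumption: correct assignments tend to score higher than incorrect ones,
    # so the "correct" component is initialised at the upper end of the range.
    mean_inc, mean_cor = lo + 0.25 * span, lo + 0.75 * span
    sd_inc = sd_cor = max(span / 4.0, 1e-6)
    prior_cor = 0.5
    probs = [0.5] * len(scores)
    for _ in range(n_iter):
        # E-step: posterior probability that each assignment is correct.
        probs = []
        for s in scores:
            c = prior_cor * normal_pdf(s, mean_cor, sd_cor)
            i = (1.0 - prior_cor) * normal_pdf(s, mean_inc, sd_inc)
            probs.append(c / (c + i) if c + i > 0.0 else 0.5)
        # M-step: re-estimate the mixing proportion and component parameters.
        prior_cor = sum(probs) / len(scores)
        mean_cor, sd_cor = weighted_mean_sd(scores, probs)
        mean_inc, sd_inc = weighted_mean_sd(scores, [1.0 - p for p in probs])
    return probs, prior_cor


def expected_error_rate(probs, threshold):
    """Expected fraction of incorrect assignments among those with p >= threshold."""
    accepted = [p for p in probs if p >= threshold]
    if not accepted:
        return 0.0
    return sum(1.0 - p for p in accepted) / len(accepted)


def simulate_scores(n_incorrect=700, n_correct=300, seed=0):
    """Synthetic scores for illustration only: incorrect first, then correct."""
    rng = random.Random(seed)
    return ([rng.gauss(0.0, 1.0) for _ in range(n_incorrect)]
            + [rng.gauss(5.0, 1.0) for _ in range(n_correct)])


if __name__ == "__main__":
    scores = simulate_scores()
    probs, prior_cor = fit_em(scores)
    print("estimated fraction correct:", round(prior_cor, 3))
    print("expected error rate at p >= 0.9:", expected_error_rate(probs, 0.9))
```

Filtering on the computed probability rather than on a raw score is what gives a predictable error rate in this sketch: the mean of `1 - p` over the accepted assignments estimates the fraction of accepted assignments that are wrong, provided the fitted model describes the data well.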