Commentary Free access | 10.1172/JCI27467
Center for Applied Proteomics and Molecular Medicine, College of Arts and Sciences, George Mason University, Manassas, Virginia, USA.
Address correspondence to: Lance A. Liotta or Emanuel F. Petricoin, Center for Applied Proteomics and Molecular Medicine, College of Arts and Sciences, George Mason University, 10900 University Boulevard, MS 4E3, Room 181A, Manassas, Virginia 20110, USA. Phone: (703) 993-9444; Fax: (703) 993-4288; E-mail: lliotta@gmu.edu (L.A. Liotta). Phone: (703) 993-8646; Fax: (703) 993-4288; E-mail: epetrico@gmu.edu (E.F. Petricoin).
Published January 4, 2006
Recent studies have established distinctive serum polypeptide patterns through mass spectrometry (MS) that reportedly correlate with clinically relevant outcomes. Wider acceptance of these signatures as valid biomarkers for disease may follow sequence characterization of the components and elucidation of the mechanisms by which they are generated. Using a highly optimized peptide extraction and matrix-assisted laser desorption/ionization–time-of-flight (MALDI-TOF) MS–based approach, we now show that a limited subset of serum peptides (a signature) provides accurate class discrimination between patients with 3 types of solid tumors and controls without cancer. Targeted sequence identification of 61 signature peptides revealed that they fall into several tight clusters and that most are generated by exopeptidase activities that confer cancer type–specific differences superimposed on the proteolytic events of the ex vivo coagulation and complement degradation pathways. This small but robust set of marker peptides then enabled highly accurate class prediction for an external validation set of prostate cancer samples. In sum, this study provides a direct link between peptide marker profiles of disease and differential protease activity, and the patterns we describe may have clinical utility as surrogate markers for detection and classification of cancer. Our findings also have important implications for future peptide biomarker discovery efforts.
Josep Villanueva, David R. Shaffer, John Philip, Carlos A. Chaparro, Hediye Erdjument-Bromage, Adam B. Olshen, Martin Fleisher, Hans Lilja, Edi Brogi, Jeff Boyd, Marta Sanchez-Carbayo, Eric C. Holland, Carlos Cordon-Cardo, Howard I. Scher, Paul Tempst
The low molecular weight region of the serum peptidome contains protein fragments derived from 2 sources: (a) high-abundance endogenous circulating proteins and (b) cell and tissue proteins. While some researchers have dismissed the serum peptidome as biological trash, recent work using mass spectrometry–based (MS-based) profiling has indicated that the peptidome may reflect biological events and contain diagnostic biomarkers. In this issue of the JCI, Villanueva et al. report on MS-based peptide profiling of serum samples from patients with advanced prostate, bladder, or breast cancer as well as from healthy controls. Surprisingly, the peptides identified as cancer-type–specific markers proved to be products of enzymatic breakdown generated after patient blood collection. The impact of these results on cancer biomarker discovery efforts is significant because it is widely believed that proteolysis occurring ex vivo should be suppressed because it destroys endogenous biomarkers. Villanueva et al. now suggest that this suppression may in fact be preventing biomarker generation.
Despite the tremendous urgency to identify clinically useful biomarkers for early disease detection, there have been only a few recent examples of such analytes that have had any real impact at the bedside (1, 2). Many scientists have pointed to what they perceive to be a dried-up blood-borne cancer biomarker pipeline for disease detection, since recent searches for a single, cancer-specific marker have not proved fruitful. In response to this challenge, investigators in the field of proteomics have shifted their focus to experimental methods such as mass spectrometry (MS), which does not require knowledge of a protein’s amino acid sequence prior to effective detection of the analyte. These MS-based methods offer new approaches whereby signatures of multiple analytes measured simultaneously comprise the diagnostic classifier (3–10). MS analysis of the blood proteome is proving facile at probing and profiling proteomic information that may encompass hundreds of candidate disease biomarkers without the need for a priori knowledge of their existence or relevance to disease states (4–10).
Within this field of research, interest continues to grow regarding a previously unexplored reservoir: the array of existing proteins in a patient’s serum (termed the serum proteome), particularly those of low molecular weight (LMW), as well as the metabolic products of these serum proteins (the serum peptidome, fragmentome, or degradome) (11, 12). Prior efforts in the search for serum and plasma protein biomarkers utilized gel-based separation technologies, which cannot readily separate and distinguish molecules of less than 10 kDa in size. In contrast, MS is particularly well suited for the detection of molecules within the LMW range of analysis (<20 kDa). In recognition of this attribute, investigators began to use MS to explore the LMW component of the circulatory proteome in order to determine whether the LMW pool contained any disease-related biomarker candidates. This method was first applied to examine the sera of patients with ovarian cancer (4) and then later for other cancers (5–9) and nonneoplastic diseases (10). Early research studies revealed an apparent abundance of LMW proteins and peptides that potentially contain disease-specific information and showed that changes in the expression patterns of these molecules may be disease specific. However, these early signatures, derived from MS analysis of LMW serum proteins, comprised unidentified ions (13, 14). Subsequent efforts to sequence these disease-specific ions revealed they were fragments derived from larger parent molecules that are normally too large to passively diffuse through the endothelium into the circulation and fall into 1 of 2 general categories: (a) fragments of endogenous, high-abundance proteins, such as transthyretin (15), and (b) fragments of low-abundance cellular and tissue proteins, such as breast cancer 2, early onset (BRCA2) (16, 17). However, despite the potential richness of information contained within the serum peptidome, there remained cause for concern.
Once sequenced, some putative disease marker ions were identified as fragments of highly abundant blood components found in healthy individuals. This led researchers to question the disease specificity of this repertoire of protein fragments (13).
While some scientists have dismissed the LMW serum peptidome as noise, biological trash, or nonspecific epiphenomena (13, 14) too small to be biologically relevant, others have proposed that just the opposite is the case, that the LMW serum peptidome may contain a rich, untapped source of disease-specific diagnostic information (11). In this issue of the JCI, Villanueva et al. (18) provide evidence to this effect by utilizing MS-based serum peptidome profiling in order to identify qualitative and quantitative patterns or signatures that can indicate the presence or absence of specific types of cancer. The authors employed an automated peptide extraction technique utilizing magnetic, reverse-phase beads for analyte capture from subject sera coupled with matrix-assisted laser desorption/ionization–time-of-flight (MALDI-TOF) MS to generate a peptide signature that could classify patients with advanced prostate, breast, or bladder cancer and differentiate them from healthy controls. While these investigators did not employ noncancer inflammatory disease controls within their study, the results support the robustness of their disease-specific peptide signature since the set of marker peptides enabled highly accurate cancer class prediction for an independent validation set of prostate cancer samples.
Sequence identification of the 61 marker peptide peaks that provided the greatest degree of cancer class separation as determined by statistical significance revealed that most of the cancer-type–specific biomarker fragments were generated in patient serum by enzymatic cleavage at previously known endoprotease cleavage sites after the blood sample was collected from the patient (18). Consequently, Villanueva et al. propose that the LMW biomarkers that they found in this study are not expressed directly by the diseased tissue but are in fact generated ex vivo by proteinase-mediated enzymatic cleavage as part of the coagulation and complement activation pathways. The authors explain that fragments of endogenous blood proteins generated ex vivo serve as a substrate pool for disease-specific proteinases that arise from the tumor itself or within the tumor microenvironment. The specific substrates cleaved by the proteinases are themselves degradation products of the clotting cascade. The authors hypothesize that the cancer-type–specific signatures they detect within the LMW portion of the serum peptidome are an indirect snapshot of the enzyme activity of tumor cells. Therefore, in the authors’ view, the resultant peptide signatures are composed of what could be considered as surrogate markers for the detection and classification of certain types of cancer.
The conclusions that may be drawn from the Villanueva et al. study (18) have potentially significant implications for the field of biomarker research and commercial clinical diagnostics. The authors state that it appears that a large part of the human serum peptidome, as detected by their bead-mediated analyte capture/MALDI-TOF MS approach, is produced ex vivo by degradation of endogenous substrates by endogenous proteinases. Since the diagnostic signatures are produced by circulating, disease-specific proteinases that act on their precursor peptide substrates after the blood has been removed from the patient, proteolytic degradation occurring after serum harvesting and after blood clotting is necessary for the production of at least 1 class of LMW biomarkers and should not be suppressed by the addition of proteinase inhibitors. This recommendation is exactly the opposite of that of many scientists who advocate the immediate addition of protease inhibitors during the blood collection process to specifically inhibit protease activity believed to contaminate the biomarker archive. The relative merit of serum versus plasma as the diagnostic fluid of choice is hotly debated in the world of diagnostic biomarkers. The results reported here by Villanueva et al. suggest that, in contrast to serum collection, plasma collection (which suppresses the clotting cascade) will mask the presence of the disease-associated proteinases that ultimately act on the exogenous fragment pool and produce the MALDI-TOF diagnostic signatures. Consequently, the authors suggest that serum is inherently superior to plasma as a source of diagnostic information contained within the peptidome.
While Villanueva et al. (18) provide evidence to demonstrate one method by which the LMW peptidome is created ex vivo, there is a growing body of evidence that circulating blood already contains an abundance of protein fragments apparently derived from cells and tissues that are produced in vivo. Previous work by Tirumalai et al. (12) and most recently by Lowenthal et al. (16) has revealed a vast and diverse source of LMW and low-abundance fragments of cellular proteins that are not cleavage products of resident serum proteins. This view of the LMW peptidome is one in which low-abundance peptide biomarkers produced from specific and ongoing tumorigenic processes such as apoptosis, tumor-stromal interaction, vascularization, immune cell infiltration, and antigenic processing exist in a sequestered state, complexed to highly abundant resident blood proteins such as albumin (12, 16, 19) (Figure 1). The concentration of any LMW analyte in the circulation is completely dependent on its production and clearance rates. Given the ability of the glomeruli to efficiently and effectively remove molecules smaller than approximately 50 kDa, any LMW molecule generated in vivo would be cleared rather quickly, thereby reducing the concentration of the analyte at the time the blood is drawn to potentially undetectable levels. Sequestration of peptides by resident blood carrier proteins with long half-lives, such as albumin, protects the peptides from clearance. Two recent studies offer a first glimpse into the diagnostic potential of the LMW, carrier protein–bound biomarker pool. Lopez et al. report MALDI-TOF fingerprinting of serum for Alzheimer disease detection (20). They harvested albumin-bound LMW protein fragments as the direct input for their MS ion fingerprint analyses. Alzheimer disease–associated fingerprints found in a training set were validated using an independent test set. The approach of Lopez et al. (20) was distinct from that used by Villanueva et al. (18).
A second study investigated the identities of LMW, albumin-bound fragments obtained from the sera of ovarian cancer patients (16). The authors identified a large number of ovarian cancer–associated peptides that were different from the peptides found in high-risk, disease-free subjects. A subset of peptides was found only in the samples from patients with stage 1 ovarian cancer. Sequencing of these peptides indicated that they were fragments of low-abundance molecules such as BRCA2, tyrosine kinases, other signaling molecules, and intracellular scaffolding proteins. Thus, the endogenous circulating peptidome may be potentially redefined as a subset of the blood interactome — protein analytes that exist in the circulation as protein-protein complexes. Identification of the components of the LMW circulatory peptidome by MS sequencing provides a list of specific analytes that is independent of the measurement technology (e.g., MALDI-TOF). Fingerprints of MS ions can therefore be replaced with panels of named protein biomarkers. Regardless of whether the peptidome is derived from the tissue/microenvironment in vivo or ex vivo, the implications for the diagnostics arena are enormous.
Figure 1. Proteinases generate biomarker fragments. Circulating protein fragments generated in the diseased tissue microenvironment may serve as diagnostic protein markers. Proteolytic cascades within the tissue (a product of the interacting cellular ecology such as stromal-epithelial interactions), immune cell MHC presentation, or apoptosis generate protein fragments that passively diffuse into the circulation. Shed LMW peptides are protected from kidney-mediated clearance by sequestration on abundant resident blood proteins such as albumin. According to the results presented by Villanueva et al. (18) in this issue of the JCI, diagnostic protein fragments can also be generated ex vivo by circulating enzymes derived from the diseased tissue microenvironment acting on exogenously derived peptides produced by serum collection methodology (see Figure 10 in ref. 18).
Conventional immunoassay platforms cannot measure panels of analyte fragments. This is because immunoassays, by their very definition, rely on antibody-based capture and detection methods. An antibody-based assay cannot distinguish the parent molecule from its cleaved fragments (the latter of which could possess the greatest diagnostic potential) since the antibody recognizes its cognate epitope in both the parent and fragment molecules. Thus, the future of fragment-based diagnostics will require the invention and adoption of wholly new technologies that read out both the identity and the exact size of the molecule. Immuno-MS is 1 example of how this could be achieved (Figure 2). With this approach, a microaffinity antibody column, perhaps in a multiplexed microwell format, would first be used to capture the parent protein along with any of its fragment isoforms that contain the antibody recognition site. Next, the captured fragments would be eluted from the antibody column directly into a mass spectrometer (such as a MALDI-TOF MS). MS analysis of the eluted peptides would provide an extremely accurate mass determination of the entire population of captured peptides. Thus, in only 2 steps, a panel of peptide fragments derived from a known parent molecule could be rapidly tabulated. Future rounds of investigation into the nature and diagnostic potential of the serum peptidome will likely uncover many more surprises. Nevertheless, based on results to date, we now expect that the diagnostic potential of the LMW serum peptidome will likely be dependent on the examination of the following: (a) low-abundance, tissue-derived peptides, many of which avoid clearance by binding to carrier proteins; (b) size- and cleavage-specific fingerprints of fragments derived from cellular parent molecules; and (c) now, based on the findings of Villanueva et al. (18), additional signatures of cleavage products produced ex vivo after blood clotting during serum collection. 
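The immuno-MS argument above can be made concrete with a small sketch. It is illustrative only: the parent sequence `ACDEFGHIK` and the epitope `EFG` are hypothetical and do not correspond to any marker discussed here, and the helper names are assumptions introduced for this example. The sketch lists every contiguous fragment of a parent that still contains the antibody recognition site (all of which an antibody would capture) and shows that each has a different monoisotopic mass, which is the property an MS readout would exploit.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # mass of H2O added for the free N- and C-termini


def peptide_mass(seq: str) -> float:
    """Monoisotopic mass of an unmodified linear peptide."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def fragments_with_epitope(parent: str, epitope: str) -> list[str]:
    """All contiguous fragments of `parent` that contain the first
    occurrence of `epitope`, i.e. everything the antibody would capture."""
    pos = parent.find(epitope)
    if pos < 0:
        return []
    end = pos + len(epitope)
    return [parent[s:e]
            for s in range(pos + 1)
            for e in range(end, len(parent) + 1)]


# Hypothetical parent protein and antibody recognition site.
parent = "ACDEFGHIK"
epitope = "EFG"

captured = fragments_with_epitope(parent, epitope)
# The antibody binds every entry; only the mass tells them apart.
for frag in sorted(captured, key=peptide_mass):
    print(f"{frag:<10s} {peptide_mass(frag):10.4f} Da")
```

In this toy case the capture step alone cannot separate the parent from its fragments, whereas the mass column can. A real assay would also have to account for charge states, modifications, and instrument mass accuracy, none of which are modeled here.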
Far from drying up, the LMW blood-borne biomarker pipeline is surging with potential. Nevertheless, this potential will never be realized unless commonly adopted and specific blood collection protocols are implemented across clinical and research laboratories, serum and plasma reference sets are developed and standardized, instrumentation for measuring panels of specific fragments is proven to be both sensitive and reproducible, and the diagnostic utility of these biomarkers is extensively validated in clinical trials conducted in accordance with regulatory guidelines developed by the College of American Pathologists and Clinical Laboratory Improvement Amendments.
Figure 2. Immuno-MS provides a means for rapidly determining the specific size and identity of each member of a panel of peptide marker fragments present within patient sera. As part of a high-throughput assay performed in clinical diagnostic laboratories, patient serum would be applied to a multiplexed plate of microcolumns filled with antibody-immobilized beads. Each microcolumn captures both the parental and fragment isoforms of the candidate marker since they both contain the antibody recognition site. The captured population of analytes, including the fragment(s) with potential for disease detection and/or discrimination, is eluted and analyzed directly by MALDI-TOF MS. The presence of the specific peptide biomarker at its precise mass/charge ratio (m/z) would be used as a diagnostic test result. This analysis could be performed rapidly by simple software that determines if the ion peak is present or absent at the predefined m/z location.
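A minimal sketch of the kind of "simple software" mentioned in the legend is given below, under stated assumptions: the spectrum is already centroided into parallel lists of m/z and intensity values, and the tolerance, intensity threshold, marker names, and all numeric values are hypothetical. A peak is called present if any point within the m/z window reaches the threshold. This is not a validated peak-calling method.

```python
from typing import Sequence


def peak_present(mz: Sequence[float], intensity: Sequence[float],
                 target_mz: float, tol: float = 0.5,
                 min_intensity: float = 100.0) -> bool:
    """Return True if any point within target_mz +/- tol has an
    intensity of at least min_intensity."""
    if len(mz) != len(intensity):
        raise ValueError("mz and intensity must have the same length")
    return any(abs(m - target_mz) <= tol and i >= min_intensity
               for m, i in zip(mz, intensity))


# Hypothetical centroided spectrum; values are made up for illustration.
spectrum_mz = [1020.5, 1465.7, 1536.7, 2021.1]
spectrum_int = [850.0, 40.0, 1200.0, 310.0]

# Hypothetical marker panel: name -> predefined m/z location.
panel = {"marker_A": 1536.7, "marker_B": 1465.7, "marker_C": 1780.0}

# One present/absent call per marker in the panel.
report = {name: peak_present(spectrum_mz, spectrum_int, target)
          for name, target in panel.items()}
print(report)
```

The tolerance and threshold would in practice depend on instrument resolution, calibration, and noise level, so the defaults here are placeholders rather than recommendations.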
Nonstandard abbreviations used: BRCA2, breast cancer 2, early onset; LMW, low molecular weight; MALDI-TOF, matrix-assisted laser desorption/ionization–time-of-flight; MS, mass spectrometric, mass spectrometry.
Conflict of interest: The authors have no personal conflicts, financial or otherwise. They have US government–assigned and –owned patent applications that cover aspects of serum biomarker discovery and amplification discussed in this manuscript. As former government employees, if the US government licenses these patents the authors are entitled to receive a share of the royalties.
Reference information: J. Clin. Invest. 116:22–26 (2006). doi:10.1172/JCI27467.
See the related article beginning on page 271.