
ALGORITHMS FOR SHOTGUN PROTEOMICS SPECTRAL IDENTIFICATION

AND QUALITY ASSESSMENT

By

Ze-Qiang Ma

Dissertation

Submitted to the Faculty of the

Graduate School of Vanderbilt University

in partial fulfillment of the requirements

for the degree of

DOCTOR OF PHILOSOPHY

In

Biomedical Informatics

May, 2012

Nashville, Tennessee

Approved:

Professor David L. Tabb
Professor Daniel C. Liebler
Professor Bing Zhang
Professor Kathleen L. Gould
Professor Zhongming Zhao

ACKNOWLEDGMENTS

I would like to express profound gratitude to my advisor, Dr. David L. Tabb, for his invaluable support, supervision and helpful suggestions throughout all my graduate school research work. I am also grateful to my other dissertation committee members, Dr. Daniel C. Liebler, Dr. Bing Zhang, Dr. Kathleen L. Gould and Dr. Zhongming Zhao, who were very supportive of my research and provided valuable advice on my dissertation work.

I would like to thank the other members of the Tabb group, particularly Dr. Surendra Dasari and our star programmer Matt Chambers, for their tremendous help in my research. It was always fun to work with them, and I learned something new from them every day.

I am also grateful to Dr. Amy-Joan L. Ham, Dr. Stacy D. Sherrod and Dr. Robbert Slebos at the Jim Ayers Institute for Precancer Detection and Diagnosis at Vanderbilt University for providing testing data sets and helpful discussions for my dissertation work.

Finally, I would like to express my gratitude to my wife Yang Wang and our lovely daughter Olivia Ma for all their unconditional support and patience. I want to thank my parents for being ever so understanding and supportive.

Thanks to NIH grants R01 CA126218 and U24 CA126479 for supporting my research work.


ABBREVIATIONS

1D, 2D    One-Dimensional, Two-Dimensional
BSA       Bovine Serum Albumin
CID       Collision Induced Dissociation
CPTAC     Clinical Proteomic Tumor Analysis Consortium
Da        Dalton
DNA       DeoxyriboNucleic Acid
DTT       DiThioThreitol
ESI       ElectroSpray Ionization
ETD       Electron Transfer Dissociation
FDR       False Discovery Rate
FPR       False Positive Rate
FTICR     Fourier Transform Ion Cyclotron Resonance
GUI       Graphical User Interface
HCD       Higher-energy Collision Dissociation
HPLC      High Pressure Liquid Chromatography

–  –  –

IMAC      Immobilized Metal Ion Affinity Chromatography
MALDI     Matrix Assisted Laser Desorption and Ionization
MRM       Multiple Reaction Monitoring

–  –  –

NCI       National Cancer Institute
NGS       Next Generation Sequencing
NIST      National Institute of Standards and Technology
OMSSA     Open Mass Spectrometry Search Algorithm
PEP       Posterior Error Probability

–  –  –

SDS-PAGE  Sodium Dodecyl Sulfate PolyAcrylamide Gel Electrophoresis
S/N       Signal-to-Noise ratio
TCGA      The Cancer Genome Atlas
XIC       Extracted Ion Chromatograms

TABLE OF CONTENTS

ACKNOWLEDGMENTS

ABBREVIATIONS

LIST OF TABLES

LIST OF FIGURES

Chapter I. INTRODUCTION

I.1 Mass Spectrometry-Based Proteomics

I.1.1 Overview

I.1.2 Sample Preparation and Separation

I.1.3 Protein Digestion

I.1.4 Mass Spectrometry Instruments

I.1.5 Peptide Fragmentation

I.2 Proteomics Data Analysis

I.2.1 Overview

I.2.2 Peptide Identification

I.2.3 Peptide Validation

I.2.4 Protein Inference

I.3 Instrumentation Quality Control

I.4 Dissertation Outline

–  –  –

IDENTIFICATIONS VIA SPECTRAL CLUSTERING

II.1 Introduction

II.2 Algorithm

II.2.1 Overview

II.2.2 Spectral Clustering

II.2.3 Rescue of Spectral Identifications

II.2.4 Bayesian Average Score

II.3 Data Sources

II.4 Results and Discussion

–  –  –

Localization Ambiguity

II.4.2 Rescue of Spectra in Comparative Analysis

II.4.3 Rescue of Spectra in a Variety of Datasets

II.5 Conclusion

III. SCANRANKER: QUALITY ASSESSMENT OF TANDEM MASS SPECTRA VIA SEQUENCE TAGGING

III.1 Introduction

III.2 Algorithm

III.2.1 Overview

III.2.2 BestTagScore Subscore

III.2.3 BestTagTIC Subscore

III.2.4 TagMzRange Subscore

–  –  –

III.3 Data Sources

III.4 Results and Discussion

III.4.1 Subscore Evaluation

III.4.2 Removal of Low Quality Spectra

III.4.3 Recovery of Unidentified High Quality Spectra

III.4.4 Comparison of ScanRanker to QualScore

III.4.5 Prediction of Richness of Identifiable Spectra

III.4.6 Use of Quality Score in Peptide Validation

III.4.7 Selection of Spectra for De Novo Sequencing

III.4.8 Use of ScanRanker in Cross-linking Analysis

III.5 Conclusion

IV. QUAMETER: MULTI-VENDOR PERFORMANCE METRICS FOR LC-MS/MS PROTEOMICS INSTRUMENTATION

IV.1 Introduction

IV.2 Overview

IV.3 Data Sources

IV.4 Results and Discussion





IV.4.1 Differences between QuaMeter and MSQC

IV.4.2 Multi-vendor Performance

IV.4.3 Impact of identification tools

IV.5 Conclusion

–  –  –

V.1 Summary of Results

V.2 Future Direction

V.2.1 Peptide Identification

V.2.2 PTM Identification and Validation

V.2.3 Next Generation Sequencing and Proteomics

V.2.4 Integration of Omics Data

V.2.5 Targeted Proteomics

Appendix A. SOFTWARE CONFIGURATIONS

MyriMatch Configurations

Sequest Configurations

X!Tandem Configurations

PepNovo Configurations

TagRecon Configurations

Pepitome Configurations

ScanRanker Configurations

IDPicker Configurations

QuaMeter Configurations

REFERENCES

LIST OF TABLES

Table 1. Bioinformatics tools for MS-based proteomics data analysis.

Table 2. Experimental datasets for the evaluation of IDBoost.

Table 3. Experimental datasets for the evaluation of ScanRanker.

Table 4. Experimental datasets for the evaluation of QuaMeter.

LIST OF FIGURES

Figure 1. The typical MS-based proteomics workflow.

Figure 2. Theoretical fragmentation of a peptide

Figure 3. Mobile proton model for peptide fragmentation.

Figure 4. The typical MS-based proteomics data analysis workflow.

Figure 5. Four peptide identification strategies.

Figure 6. Peptide identification by the database search strategy.

Figure 7. Score distribution for correct and incorrect PSMs.

Figure 8. A simplified example of protein inference.

Figure 9. A diagram of rescuing unidentified spectra in a cluster.

Figure 10. Analysis of rescued PSMs in phosphorylation studies.

Figure 11. Impact of IDBoost on recognition of differentially expressed proteins in comparative analysis.

Figure 12. IDBoost performance in a variety of datasets.

Figure 13. A screenshot of ScanRanker GUI

Figure 14. A screenshot of IonMatcher GUI.

Figure 15. Combining three subscores improves the discriminating power of ScanRanker.

Figure 16. Removing poor MS/MS scans in ScanRanker does not significantly reduce identifications.

–  –  –

Figure 18. Evaluation of ScanRanker to recover unidentified high quality spectra.

Figure 19. Comparison of ScanRanker to QualScore.

Figure 20. ScanRanker scores predict the richness of identifiable spectra.

Figure 21. Adding ScanRanker scores in peptide validation increases the number of confident spectrum identifications.

Figure 22. ScanRanker scores can be used to predict de novo sequencing success.

Figure 23. ScanRanker helps to prioritize spectra for manual inspection in cross-linking analysis.

Figure 24. Workflow diagram for QuaMeter operation

Figure 25. QuaMeter generates similar metrics as MSQC except several chromatographic metrics due to the use of distinct chromatogram extraction tools.

Figure 26. QuaMeter generates reliable chromatographic data in instruments from multiple vendors via the Crawdad function in ProteoWizard.

Figure 27. QuaMeter computes QC metrics for multiple instrument platforms.

Figure 28. QuaMeter metrics help to spot abnormal instrument performance.

Figure 29. Distinct identification tools produce different QC metrics with similar variation.

Figure 30. A summary of three bioinformatics tools in proteomics data analysis workflow.

Chapter I. INTRODUCTION

The topic of this dissertation is the development of novel algorithms and bioinformatics tools for proteomics data analysis. This chapter provides a general introduction to the field of proteomics and the data analysis process. It is not intended as complete coverage of all areas of proteomics, but rather as an overview that gives the background needed to understand the work detailed in the following chapters.

I.1 Mass Spectrometry-Based Proteomics

I.1.1 Overview

Proteomics as a discipline can be defined as the identification and quantification of the complete set of proteins in a cell or tissue at a particular state. Although a number of alternative proteomics strategies such as protein array based methods have been developed, mass spectrometry (MS)-based proteomics has become the method of choice for large-scale studies. MS-based proteomics approaches have proved successful in molecular and cellular biology research, including post-translational modification (PTM) identification and the study of protein-protein interactions (Aebersold & Mann 2003). With recent improvements in instrumentation and methodology, proteomics has undergone tremendous advances over the past few years, enabling many powerful applications such as functional analysis of complex organisms (Schrimpf et al. 2009), global analysis of PTMs (Witze et al. 2007), large-scale reconstruction of protein interaction networks (Gstaiger & Aebersold 2009) and introduction of proteomics in clinical and translational research (Bousquet-Dubouch et al. 2011).

Figure 1. The typical MS-based proteomics workflow. (Stages recoverable from the figure: tandem mass spectra, peptide identifications, confident peptide list, assembled protein list.)

The typical workflow for a bottom-up MS-based proteomics experiment is illustrated in Figure 1. The first step is to reduce the complexity of a biological sample by one or several separation techniques such as SDS-PAGE and two-dimensional (2D) gel electrophoresis. Large proteins are then digested into peptides using site-specific proteases. Next, peptide mixtures are separated by liquid chromatography and ionized in a mass spectrometer. Precursor ions with particular mass-to-charge (m/z) values are selected and collided with a nonreactive gas to generate fragment ions. The m/z values and peak intensities of the fragment ions are recorded in tandem mass spectra, which computational tools interpret as peptide sequences. Finally, the identified peptides are assembled into a list of proteins that are most likely present in the sample.
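As an illustration of the digestion and precursor m/z steps described above, the following is a minimal sketch in Python. It is not code from this dissertation or from any of its tools. It assumes trypsin as the site-specific protease (cleaving after K or R unless the next residue is P), uses standard monoisotopic residue masses, and ignores missed cleavages and modifications. The function names and the example sequence are hypothetical.

```python
# Standard monoisotopic residue masses in Da (assumed values, not taken from the source).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # mass of H2O added to the summed residues of a free peptide
PROTON = 1.007276  # mass of one proton (charge carrier)


def tryptic_digest(protein):
    """Split a protein after K or R, except when the next residue is P."""
    peptides = []
    start = 0
    for i, aa in enumerate(protein):
        next_aa = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])  # C-terminal peptide
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def precursor_mz(peptide, charge):
    """m/z of the peptide carrying `charge` protons: (M + z * proton) / z."""
    return (peptide_mass(peptide) + charge * PROTON) / charge


if __name__ == "__main__":
    # Hypothetical sequence, used only to exercise the functions.
    for pep in tryptic_digest("AKRPGKPEPTIDE"):
        print(pep, round(precursor_mz(pep, 2), 4))
```

The same neutral peptide appears at different m/z values depending on its charge state, which is why precursor selection is described in terms of m/z rather than mass.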

I.1.2 Sample Preparation and Separation

In proteomics studies, complex biological samples that contain a large number of proteins are often separated into simpler mixtures prior to MS analysis. Various separation techniques can be used for this purpose. A widely used approach is to separate protein mixtures by SDS-PAGE, and then cut the gel into fractions for MS analysis. Samples of high complexity are now often fractionated by 2D-gel electrophoresis (Kenrick & Margolis 1970), which separates proteins based on their isoelectric points and molecular weights. Each spot in the gel may represent one or several purified proteins that can be further analyzed by MS. Recently a gel-based peptide-level isoelectric focusing approach (Hörth et al. 2006) has been shown to provide complementary coverage to the conventional gel-based fractionation method and yield higher identification rates (Hubner et al. 2008).

A gel-free approach known as shotgun proteomics directly analyzes large mixtures of peptides by coupling the electrospray ionization (ESI) source of a mass spectrometer in-line with a liquid chromatography (LC) system. Peptides are separated in the chromatography system to reduce the complexity of the mixture entering the instrument. The two major types of LC systems are reverse phase high pressure liquid chromatography (RP-HPLC), which separates molecules by hydrophobicity, and ion exchange chromatography, which separates molecules by charge. High complexity samples can be separated using the multidimensional protein identification technology (MudPIT) (Washburn et al. 2001), which uses two-dimensional chromatography. The first dimension is usually a strong cation exchange (SCX) column with high loading capacity. Eluted samples are subsequently separated by reverse phase chromatography.


