CONFRONTING COMPLEXITY: A COMPREHENSIVE STATISTICAL AND

COMPUTATIONAL STRATEGY FOR IDENTIFYING THE MISSING LINK

BETWEEN GENOTYPE AND PHENOTYPE

By

Tricia Ann Thornton-Wells

Dissertation

Submitted to the Faculty of the

Graduate School of Vanderbilt University

in partial fulfillment of the requirements

for the degree of

DOCTOR OF PHILOSOPHY

in

Neuroscience

December, 2006

Nashville, Tennessee

Approved:

Professor Jonathan L. Haines

Professor Michael P. McDonald

Professor Jason H. Moore

Professor Marylyn D. Ritchie

Copyright © 2006 by Tricia Ann Thornton-Wells

All Rights Reserved

To my wonderful parents, Mary and John, who have always believed in me and have encouraged and enabled me to do whatever I have dreamed

To my amazing husband, Bryce, who is infinitely supportive and is always poised to provide external motivation when I need it most

To my beautiful baby boy, Greyson, who provides me with joy and a healthy perspective on life and work

and

To my great uncle, Morgan Freeman, and my great aunt, Sara Campisi, whose affliction with Alzheimer Disease was a primary motivating factor for this work

ACKNOWLEDGEMENTS

This work would not have been possible without the financial support of the Neuroscience Graduate Program (T32 MH64913), the Department of Biomedical Informatics, the National Library of Medicine Training Grant Fellowship (LMO7450or my advisor, Professor Jonathan L. Haines. I would also like to thank Marylyn Ritchie, Jason Moore and Jonathan Haines for their tremendous help with interpretation of results.

I am grateful to all of those with whom I have had the pleasure to work during this and other related projects. Each member of my Dissertation Committee has provided professional guidance and taught me a great deal about scientific research. I would especially like to thank Michael P. McDonald, Ph.D., the chairman of my committee. I would like to thank Jackie Bartlett for her assistance in learning and applying the traditional genetic analysis methods and Scott Dudek for programming the fuzzy k-modes clustering algorithm. I would also like to thank fellow graduate students Will Bush, Todd Edwards, Sharon Liang, Alison Motsinger and David Reif for their support and assistance with hashing out ideas and implementing programs.

I am particularly indebted to several individuals, who have been supportive of my interdisciplinary research goals and have provided exceptional mentoring with regard to my career development. Those persons are: Jonathan L. Haines, Ph.D., Director of the Center for Human Genetics Research, Jason H. Moore, Ph.D., Director of the Computational Genetics Laboratory at Dartmouth College, and Elaine Sanders-Bush, Ph.D., Director of the Vanderbilt Brain Institute.

No one has been more important to me in the pursuit of this project than the members of my family. I would like to thank my parents, John and Mary Thornton, whose love and guidance are with me in whatever I pursue. Most importantly, I wish to thank my loving husband, Bryce, and my incredible son, Greyson, who provide continuous support and motivation. I should also acknowledge my second, and as-yet-unborn, son, who has in his own way provided considerable motivation for the timely completion of this project.

TABLE OF CONTENTS

DEDICATION

ACKNOWLEDGEMENTS

LIST OF TABLES

LIST OF FIGURES

LIST OF ABBREVIATIONS

Chapter I. INTRODUCTION

II. BACKGROUND

Complex human genetic disease

Categorization and analytical approaches

Heterogeneity

Interactions

Retooling for the future

III. A COMPARISON OF CLUSTERING METHODS

Background

Methods

Data Simulation

Clustering Methods

Bayesian Classification

Hypergraph Clustering

Fuzzy k-Modes Clustering

Statistical Analysis

Comparison of Clustering Methods

Applicability to Real Data

Results

Discussion

Data Simulation

Comparison of Clustering Methods

Applicability to Real Data

IV. FURTHER EVALUATION OF BAYESIAN CLASSIFICATION

Background

Methods

Modification of Parameter Settings

Applicability to Real Data

Results

Discussion

V. APPLICATION OF TWO STAGE ANALYSIS APPROACH TO LATE-ONSET

ALZHEIMER DISEASE DATA

Background

Methods

Specifics of Late-Onset Alzheimer Disease Dataset

Statistical Analysis

Results

Analysis of Complete Datasets

Detection of Heterogeneity

Detection of Main Effects in Subsets of Data

Cluster 0 Results

Cluster 1 Results

Cluster 2 Results

Detection of Gene-Gene Interactions in Subsets of Data

Cluster 0 Results

Cluster 1 Results

Cluster 2 Results

Discussion

Complete Dataset Discussion

Cluster 0 Discussion

Cluster 1 Discussion

Cluster 2 Discussion

Other Discussion

VI. CONCLUSIONS AND FUTURE DIRECTIONS

Summary, Conclusions and Future Studies





Future Directions for Research

REFERENCES

LIST OF TABLES

1. Confidence Intervals around ARIHA Means by Method

2. Overall results of Chi-Square Test of Independence

3. Results of Chi-Square Test of Independence for THO Datasets

4. Bayesian Classification Parameter Settings in Simulation Studies

5. Genes Covered by Markers Genotyped in One or Both Samples

6. Markers Genotyped in Family-Based and Case-Control Samples

7. Main Effect Analysis Results for Complete Family-Based Dataset

8. Main Effect Analysis Results for Complete Case-Control Dataset

9. MDR Analysis Results for Complete Datasets

10. MDR Analysis Results for Complete Datasets, with APOE Excluded

11. Top 30 Highest-Influence Markers from Second-Round of Cluster Analysis

12. Top Five Highest-Influence Markers from Second-Round of Cluster Analysis

13. Distribution of Affected Individuals in Final Clustering Results

14. Predominant Genotypes for the Top Five High-Influence Markers by Cluster

15. Main Effect Analysis Results for Cluster 0 Family-Based Dataset

16. Main Effect Analysis Results for Cluster 0 Case-Control Dataset

17. Main Effect Analysis Results for Cluster 1 Family-Based Dataset

18. Main Effect Analysis Results for Cluster 1 Case-Control Dataset

19. Main Effect Analysis Results for Cluster 2 Family-Based Dataset

20. Main Effect Analysis Results for Cluster 2 Case-Control Dataset

–  –  –

22. Logistic Regression Results for Cluster 0 Family-Based Data Using Markers from Significant Two-Locus MDR Model

23. MDR Analysis Results for Cluster 1

24. Logistic Regression Results for Cluster 1 Case-Control Data Using Markers from Significant Two-Locus MDR Model

25. MDR Analysis Results for Cluster 2

26. Chromosomal Location and Linkage Analysis Results for Markers in VR22 and LRRTM3

27. Cluster Subset Results for Markers Found Significant in Complete Family-Based Dataset

28. Cluster Subset Results for Markers Found Significant in Complete Case-Control Dataset

LIST OF FIGURES

1. Heterogeneity-Related Factors Complicating Analysis of Genetic Disease

2. Interaction-Related Factors Complicating Analysis of Genetic Disease

3. Summary of Analytical Approaches to Heterogeneity

4. Summary of Analytical Approaches to Interactions

5. Structure of Genetic Models Used for Data Simulation

6. Novel Data Simulation Algorithm

7. Genetic Model THO (Trait Heterogeneity Only)

8. Genetic Model THL (Trait Heterogeneity with Locus Heterogeneity)

9. Genetic Model THG (Trait Heterogeneity with Gene-Gene Interaction)

10. Genetic Model THB (Trait Heterogeneity with Both Locus Heterogeneity and Gene-Gene Interaction)

11. Hypothetical Clustering of a THO Dataset

12. Example of Post-Processing of Hypergraph Clustering Result

13. Example of k-Modes Clustering

14. Comparison of ARIHA Means by Method and Model

15. Percentage of Clustering Results Achieving Cluster Recovery Levels by Method

16. Percentage of Clustering Results Achieving Cluster Recovery Levels by Method and Model

17. False Positive Rate by Significance Level (Alpha)

18. False Negative Rate by Significance Level (Alpha)

–  –  –

20. False Negative Rate by Significance Level (Alpha), Paneled by Number of Affecteds (Sample Size)

21. Percentage of Bayesian Classification Clustering Results Achieving Cluster Recovery Levels by Number of Affecteds

22. Percentage of Bayesian Classification Clustering Results Achieving Cluster Recovery Levels by Number of Nonfunctional Loci

23. Moderate Cluster Recovery across Modified Parameter Settings

24. Good Cluster Recovery across Modified Parameter Settings

25. Excellent Cluster Recovery across Modified Parameter Settings

26. Error Rates for THG and THL Genetic Model Results

27. Permutation Testing Results at Alpha of One Percent

28. Candidate Genes for Late-Onset Alzheimer Disease

29. Family-Based Data: Percentage of Missing Genotypes by Marker

30. Case-Control Data: Percentage of Missing Genotypes by Marker

31. Family-Based Data: Percentage of Missing Genotypes by Subject

32. Case-Control Data: Percentage of Missing Genotypes by Subject

33. Linkage Disequilibrium Plot of Top 5 High-Influence Markers in Family-Based Dataset

34. Linkage Disequilibrium Plot of Top 5 High-Influence Markers in Case-Control Dataset

35. Linkage Analysis of Top 5 High-Influence Markers in LRRTM3 and Flanking Markers with HetLOD Scores 2

LIST OF ABBREVIATIONS

ARIHA  Hubert-Arabie Adjusted Rand Index

BBN  Bayesian Belief Network

CART  Classification and Regression Trees

CPM  Combinatorial Partitioning Method

HWE  Hardy-Weinberg Equilibrium

LOAD  Late-Onset Alzheimer Disease

MARS  Multivariate Adaptive Regression Splines

MDR  Multifactor Dimensionality Reduction

(O)MIM  (Online) Mendelian Inheritance in Man

OSA  Ordered Subset Analysis

QTL  Quantitative Trait Locus

RPM  Restricted Partitioning Method

SNP  Single Nucleotide Polymorphism

THO  Trait Heterogeneity Only

THL  Trait Heterogeneity with Locus Heterogeneity

THG  Trait Heterogeneity with Gene-Gene Interaction

THB  Trait Heterogeneity with Both Locus Heterogeneity and Gene-Gene Interaction

CHAPTER I

INTRODUCTION

Like many common diseases with a genetic basis, the etiology of late-onset Alzheimer disease (LOAD) is complex. Evidence suggests that LOAD is a heterogeneous trait with multiple susceptibility loci and possibly gene-gene interactions involved. While there are existing methods that can address specific components of this etiology, ultimately, the real power of these methods lies in our ability to marry them into a comprehensive approach to genetic analysis, so that their relative strengths and weaknesses can be balanced and a range of alternative hypotheses can be investigated.

Thus, I propose a two-stage, multi-pronged approach to the problem of genetic analysis of LOAD in which heterogeneity is first addressed by dissecting out more homogeneous subsets of the data, and then main effects and gene-gene interactions are investigated in each of these subsets.

The theoretical basis for such an approach to the analysis of complex genetic diseases is presented in Chapter II. Definitions and examples of heterogeneity and interactions that complicate genetic analysis are presented. Existing methods for detecting heterogeneity and interactions are reviewed, and gaps in methodology are discussed.

Chapter III presents a simulation study in which the performance of three clustering methods is compared in the task of uncovering trait heterogeneity in simulated data. A novel data simulation algorithm is introduced. The best of the three clustering methods—Bayesian Classification—is chosen and its applicability to real data (based on its false positive and false negative rates) is investigated.
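As an illustration of how cluster recovery can be scored in a comparison of this kind, the sketch below computes the Hubert-Arabie adjusted Rand index (ARIHA in the list of abbreviations) between a known partition and a clustering result. The function name and the toy labels are assumptions made for illustration; this is not the code used in the simulation study.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(true_labels, predicted_labels):
    """Hubert-Arabie adjusted Rand index between two partitions of the same items."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    n = len(true_labels)
    if n < 2:
        raise ValueError("at least two items are needed to form a pair")

    # Contingency counts: items in each (true cluster, predicted cluster) cell.
    cell_counts = Counter(zip(true_labels, predicted_labels))
    row_counts = Counter(true_labels)
    col_counts = Counter(predicted_labels)

    # Numbers of item pairs that share a cell, a true cluster, a predicted cluster.
    sum_cells = sum(comb(c, 2) for c in cell_counts.values())
    sum_rows = sum(comb(c, 2) for c in row_counts.values())
    sum_cols = sum(comb(c, 2) for c in col_counts.values())
    total_pairs = comb(n, 2)

    expected = sum_rows * sum_cols / total_pairs  # agreement expected by chance
    maximum = (sum_rows + sum_cols) / 2           # largest attainable agreement
    if maximum == expected:
        # Degenerate case, e.g. both partitions place every item in one cluster.
        return 1.0
    return (sum_cells - expected) / (maximum - expected)


if __name__ == "__main__":
    simulated_truth = [0, 0, 0, 1, 1, 1]   # hypothetical true subgroups
    clustering = [1, 1, 0, 0, 0, 0]        # hypothetical clustering output
    print(adjusted_rand_index(simulated_truth, clustering))
```

A value of 1 means the two partitions agree perfectly, up to relabeling of clusters, while values near 0 mean agreement no better than chance.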

Chapter IV details an extension of this simulation study in which the implementation of the Bayesian Classification method is modified to improve performance under a wider range of conditions realistic for genetic studies. False positive and false negative rates under these conditions are also investigated.

Chapter V presents an application of the proposed two-stage comprehensive analysis to a late-onset Alzheimer disease dataset. Analysis of heterogeneity is performed using the Bayesian Classification clustering method. Main effect analysis is performed in cluster subsets. For the case-control dataset, the Pearson chi-square test of independence is applied, and for the family-based dataset, two-point linkage analysis, the Pedigree Disequilibrium Test and the Family-Based Association Test are utilized.
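For readers unfamiliar with the case-control test named above, the following is a minimal sketch of the Pearson chi-square test of independence applied to a table of genotype counts by disease status. The counts and function names are hypothetical, and the p-value helper is valid only for two degrees of freedom (a 2 x 3 table), where the chi-square survival function reduces to exp(-x/2).

```python
from math import exp


def chi_square_independence(table):
    """Return (statistic, degrees_of_freedom) for a rows-by-columns table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    if grand_total == 0 or 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs at least one observation")

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if row and column were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


def p_value_two_df(statistic):
    """Upper-tail p-value for a chi-square statistic with exactly 2 degrees of freedom."""
    return exp(-statistic / 2)


if __name__ == "__main__":
    # Hypothetical counts: rows are cases and controls, columns are three genotypes.
    counts = [[10, 20, 30],
              [30, 20, 10]]
    stat, dof = chi_square_independence(counts)
    print(stat, dof, p_value_two_df(stat))
```

For tables of other shapes, the p-value would come from a general chi-square distribution function rather than the two-degree-of-freedom shortcut used here.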

Interaction analysis is performed using the Multifactor Dimensionality Reduction method. Logistic regression is used to explore the structure of predictive MDR models found significant by permutation testing. Results of these integrated analyses are interpreted, and limitations of the study design and analysis methods are discussed.
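The central pooling step of Multifactor Dimensionality Reduction can be sketched as follows: each multilocus genotype combination is labeled high risk or low risk according to its ratio of cases to controls, and the resulting one-dimensional label is then used to classify subjects. This is a simplified illustration under stated assumptions, not the MDR software itself. It omits cross-validation and permutation testing, and the function names, the threshold of 1.0, and the toy data are all hypothetical.

```python
from collections import defaultdict


def mdr_risk_labels(genotypes, statuses, loci, threshold=1.0):
    """Map each genotype combination at `loci` to True (high risk) or False (low risk).

    genotypes: one tuple of genotype codes per subject.
    statuses:  0 for control, 1 for case, in the same order.
    """
    counts = defaultdict(lambda: [0, 0])  # combination -> [controls, cases]
    for row, status in zip(genotypes, statuses):
        combination = tuple(row[locus] for locus in loci)
        counts[combination][status] += 1

    labels = {}
    for combination, (controls, cases) in counts.items():
        # High risk when the case:control ratio meets or exceeds the threshold.
        labels[combination] = cases >= threshold * controls
    return labels


def classification_accuracy(genotypes, statuses, loci, labels):
    """Fraction of subjects whose status matches the high/low risk label."""
    correct = 0
    for row, status in zip(genotypes, statuses):
        combination = tuple(row[locus] for locus in loci)
        predicted = 1 if labels.get(combination, False) else 0
        correct += int(predicted == status)
    return correct / len(statuses)


if __name__ == "__main__":
    # Toy data: disease status depends on loci 0 and 1 jointly; locus 2 is noise.
    genotypes = [(0, 0, 1), (0, 1, 0), (1, 0, 1), (1, 1, 0),
                 (0, 0, 0), (0, 1, 1), (1, 0, 0), (1, 1, 1)]
    statuses = [0, 1, 1, 0, 0, 1, 1, 0]
    for loci in [(0,), (0, 1)]:
        labels = mdr_risk_labels(genotypes, statuses, loci)
        print(loci, classification_accuracy(genotypes, statuses, loci, labels))
```

In this toy example neither locus is informative on its own, but the two-locus model separates cases from controls, which is the kind of purely interactive effect that motivates an interaction-focused method.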

In Chapter VI, the research comprising this dissertation is put into perspective, with a discussion of the lessons learned and the immediate future directions for this work. New directions for future studies of neurogenetic diseases are also discussed, and suggestions are made as to the focus of future research efforts, given current and forthcoming phenotyping technology, such as neuroimaging.
