

STATISTICAL METHODS FOR LOW-FREQUENCY AND RARE GENETIC VARIANTS

by

Clement Ma

A dissertation submitted in partial fulfillment

of the requirements for the degree of

Doctor of Philosophy

(Biostatistics)

in the University of Michigan

Doctoral Committee:

Professor Michael L. Boehnke, Co-Chair

Research Associate Professor Laura J. Scott, Co-Chair

Professor Gonçalo Abecasis

Assistant Professor Hyun M. Kang

Assistant Professor Seunggeun Lee

Professor Peter X. Song

Assistant Professor Cristen J. Willer

© Clement Ma 2014

Dedication

To my wife, Joyce.

Acknowledgements

I would like to express my deepest thanks and gratitude to my advisors Mike Boehnke and Laura Scott. Both of you were outstanding mentors, and encouraged me to achieve what I once thought was impossible. Without your careful guidance and mentorship, I would not be the statistical geneticist that I am today. I also want to thank my committee members: Gonçalo Abecasis, Hyun Min Kang, Seunggeun Lee, Peter Song, and Cristen Willer for their constructive feedback and support for my dissertation research.

I would like to thank all my colleagues from the Center for Statistical Genetics. I learned a great deal from Tom Blackwell, who was an active collaborator on my first two dissertation topics. Thanks to Sean Caron and Paul Anderson for helping me run my simulations smoothly and efficiently on the computing cluster. I want to thank Dawn Keene and Laura Baker for helping me on my faculty applications and other administrative issues.

I want to thank my many colleagues and collaborators outside the University of Michigan. Thank you to all the GoT2D study collaborators for allowing me to use an early data freeze of the sequencing data for my dissertation research. I gained many useful insights from the regular participants of the Single Variant Group conference call. Thanks to Georg Heinze for helpful discussions regarding the Firth bias-corrected logistic regression test.

I want to thank all my Ann Arbor area friends who have made my five years here fun and memorable. Thanks to Mark Reppell and Adrian Tan who were always there, and served as groomsmen at my wedding ceremony. I would like to thank Ryan Welch, Rebecca Rothwell, Yancy Lo, Giorgio Pistis, Eleonora Porcu, Zhenzhen Zhang, Caroline Cheng, Katie Huang, Lisa Henn, Min A Jhun, Yeji Lee, Tanya Teslovich, Xueling Sim, Adam Locke, Christopher Moraes, and Stefanie Moraes for the many board game nights, dinners, and happy hours.

I want to thank my family, Danny, Joyce, and Winnie Ma for supporting me throughout my graduate studies. I was very fortunate that Ann Arbor is within driving distance of Toronto, so I was able to visit home frequently. Thanks to my new family, King, Susan, and Coral Wong, who have been very supportive and encouraging during my time in Michigan.

Most of all, I want to thank my wife, Joyce Wong, who supported me every step of the way. Five years ago, she encouraged me to pursue my dream of doctoral studies, even though it would mean we would spend over four years living apart from each other. You were always there to cheer me up, listen to my fears, and share a laugh. I am so happy that you were able to join me in Ann Arbor, and watch me complete this long yet rewarding journey. I truly could not have done this without you.

Table of Contents

Dedication

Acknowledgements

List of Tables

List of Figures

List of Supplemental Figures

Abstract

Chapter 1: Introduction

Chapter 2: Recommended joint and meta-analysis strategies for case-control association testing of single low count variants

Chapter 3: Near equivalent calibration and power of joint and meta-analysis for association analysis of quantitative traits

Chapter 4: Evaluating the calibration and power of three gene-based association tests for the X chromosome

Chapter 5: Summary, discussion, and future directions

References

List of Tables

Table 3.1: Sample sizes and untransformed HDL values for GoT2D studies and substudies

Table 4.1: Sample sizes for simulated case-control datasets

Table 4.2: Sample sizes for simulated quantitative trait datasets

Table 4.3: Type I error rates for burden, SKAT, and SKAT-O tests in binary and quantitative trait studies.

List of Figures

Figure 2.1: Type I error rates by minor allele count (MAC) for logistic regression tests in joint and meta-analysis.

Figure 2.2: Type I error rates by case-control ratio for logistic regression tests in joint and meta-analysis.

Figure 2.3: Simulation-based power curves for joint and meta-analysis.

Figure 2.4: Joint analysis type I error rates by sample size for fixed expected minor allele count (MAC)

Figure 2.5: Logistic regression p-value distributions for fixed total minor allele count (MAC).

Figure 2.6: Comparison of score test-based meta-analysis and Firth test-based joint analysis p-values in the GoT2D study





Figure 3.1: Type I error rates of inverse-normalized and normally distributed quantitative traits (QTs) for linear regression in joint and meta-analysis.

Figure 3.2: Type I error rates of inverse-normalized quantitative traits (QTs) for linear regression in joint and meta-analysis.

Figure 3.3: Type I error rates of non-normally distributed quantitative traits (QTs) for linear regression in joint and meta-analysis.

Figure 3.4: Power of linear regression in joint and meta-analysis.

Figure 3.5: Joint and meta-analysis of high density lipoprotein (HDL) in the GoT2D study

Figure 4.1: Power for gene-based tests in case-control studies assuming all causal variants are deleterious.

Figure 4.2: Power for gene-based tests in case-control studies assuming causal variants are 50% deleterious and 50% protective.

Figure 4.3: Power for gene-based tests in QT studies assuming all causal variants are deleterious.

Figure 4.4: Power for gene-based tests in QT studies assuming causal variants are 50% deleterious and 50% protective.

List of Supplemental Figures

Figure S2.1: Type I error rates by fixed expected minor allele count (MAC) for different sample sizes.

Figure S2.2: Meta-analysis type I error rates by sample size for fixed expected minor allele count (MAC)

Figure S2.3: Comparison of score and Firth test association p-values in the GoT2D study

Figure S2.4: Comparison of joint and meta-analysis p-values in the GoT2D study.

Figure S2.5: Score test type I error rate and power with study-level minor allele count (MAC) filters.

Figure S2.6: Score test type I error rate and power curves for meta-analysis of K = 10 and 50 sub-studies

Figure S2.7: Type I error rates by minor allele count (MAC) for logistic regression tests and Fisher's exact test in joint and meta-analysis.

Figure S2.8: Type I error rates by case-control ratio for logistic regression and Fisher's exact tests in joint and meta-analysis.

Figure S2.9: Simulated power curves for joint and meta-analysis.

Figure S3.1: Type I error rates of normally distributed quantitative traits (QTs) for linear regression in joint and meta-analysis with covariates.

Figure S3.2: Type I error rates of additional non-normally distributed quantitative traits (QTs) for linear regression in joint and meta-analysis.

Figure S4.1: Complete type I error rates for the burden (BURD), SKAT, and SKAT-O tests in case-control studies.

Figure S4.2: Type I error rates based on simulated datasets with re-sampling and without re-sampling.

Figure S4.3: Power simulated with X-inactivation for gene-based tests in case-control studies assuming all causal variants are deleterious.

Figure S4.4: Power simulated with X-inactivation for gene-based tests in case-control studies assuming causal variants are 50% deleterious and 50% protective.


Abstract

Genetic association studies using sequencing, dense-array genotyping, or sequencing-based imputation provide the means to identify low-frequency and rare variants associated with diseases and traits, but analysis of these variants presents new statistical challenges. Single marker tests (e.g. logistic and linear regression), and methods to combine information across studies (e.g. joint and meta-analysis) may be poorly calibrated and/or of low power.

The calibration and power of aggregation tests, where multiple rare variants are analyzed jointly, have not been evaluated for variants on the X chromosome. In my dissertation, I address three topics:

First, for case-control studies, I evaluate the calibration and power of four logistic regression tests in joint and meta-analysis for low-frequency and rare variants and demonstrate that: (a) for joint analysis, the Firth bias-corrected test is best (i.e. most powerful among well-calibrated tests); (b) for meta-analysis of balanced studies (equal numbers of cases and controls), the score test is best, but is less powerful than Firth test-based joint analysis; and (c) for meta-analysis of sufficiently unbalanced studies, all four tests can be anti-conservative, particularly the score test.

Second, for quantitative trait (QT) studies, I evaluate the calibration and power of linear regression in joint and meta-analysis and demonstrate for normally distributed QTs that: joint and sample-size weighted meta-analysis are equally well-calibrated and powerful for variants with expected minor allele count E[MAC]≥10; inverse-variance weighted meta-analysis is slightly anti-conservative for small-sized studies. For non-normally distributed QTs, joint and meta-analysis are equally anti-conservative for low-frequency and rare variants. Inverse-normal transformation of the QT remedies this problem, but transforming QTs of any distribution reduces power.
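As an illustration of the inverse-normal transformation mentioned above, the following is a minimal sketch of one common rank-based variant. It is not taken from the dissertation: the function name, the tie handling (average ranks), and the `offset` convention (rank r maps to the quantile (r - offset) / n) are assumptions made for this example.

```python
from statistics import NormalDist


def inverse_normal_transform(values, offset=0.5):
    """Rank-based inverse-normal transformation of a list of trait values.

    Tied values receive the average of their ranks. `offset` is an assumed
    convention: rank r is mapped to the normal quantile at (r - offset) / n.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    std_normal = NormalDist()
    return [std_normal.inv_cdf((r - offset) / n) for r in ranks]


# Hypothetical right-skewed trait values, used only for illustration.
skewed_trait = [0.2, 0.5, 0.9, 1.4, 2.3, 3.8, 6.1, 9.7, 15.6]
transformed = inverse_normal_transform(skewed_trait)
```

The transformed values keep the ordering of the original trait but follow an approximately standard normal shape, which is why the transformation depends only on ranks and discards the original scale.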

Third, for case-control and QT studies, I evaluate the calibration and power of three aggregation tests for the X chromosome: burden, SKAT, and SKAT-O. For case-control studies, tests are relatively well-calibrated across all simulation scenarios. Power is usually slightly increased when the coding scheme for male genotypes matches the underlying model, but power loss is small when the model is misspecified. Differences in male:female ratio in cases and controls have little effect on power. For QTs, calibration and power results are very similar to those for binary traits.
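To make the idea of a male genotype coding scheme concrete, here is a small sketch of how a per-individual burden score might be computed for X-chromosome variants. The specific codings shown (a male minor allele counted as 1 or as 2) and all names are assumptions for illustration; this page does not state which codings the dissertation evaluates.

```python
def burden_scores(genotypes, is_male, male_allele_code=2, weights=None):
    """Weighted sum of minor-allele counts per individual across variants.

    genotypes: one list per individual of minor-allele counts per variant
               (females 0/1/2, hemizygous males 0/1).
    male_allele_code: assumed coding for a male minor allele; 2 treats a
               male allele like a female homozygote, 1 counts it once.
    """
    n_variants = len(genotypes[0])
    if weights is None:
        weights = [1.0] * n_variants  # unweighted burden by default
    scores = []
    for geno, male in zip(genotypes, is_male):
        scale = male_allele_code if male else 1
        scores.append(sum(w * g * scale for w, g in zip(weights, geno)))
    return scores


# Hypothetical data: one female and one male, three variants.
example_genotypes = [[0, 1, 2], [1, 0, 1]]
example_is_male = [False, True]
scores_code2 = burden_scores(example_genotypes, example_is_male, male_allele_code=2)
scores_code1 = burden_scores(example_genotypes, example_is_male, male_allele_code=1)
```

Only the male scores change between the two codings, so the choice matters most when the proportion of males differs between the groups being compared.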

Chapter 1: Introduction

Many human diseases and biological traits can be hereditary in nature [Gottlieb and Root, 1968; Kaprio et al., 1992; Silventoinen et al., 2003], but their genetic mechanisms are not fully understood. In genome-wide association studies (GWAS), we aim to identify genetic variants that cause differences in biological traits or disease risk. While many associated variants identified by GWAS are not causal, associated variants help localize genes or genomic regions that may harbor the true causal variants. Through fine-mapping and functional studies, we hope to identify the true causal variants, and better understand the biological mechanisms underlying human diseases and traits [Shea et al., 2011; Kulzer et al., 2014].

Genotype array-based common-variant GWAS have identified thousands of genetic variants associated with hundreds of different traits [Hindorff et al., 2012]. Investigators typically use case-control studies to detect disease-associated variants and cohort studies to detect quantitative trait (QT)-associated variants. We also often analyze QTs collected from case-control studies to identify variants associated with these QTs. To increase power to detect novel variants with small effect sizes in GWAS, investigators often combine samples across multiple association studies, typically using meta-analysis of summary-level association results [Scott et al., 2007], and less frequently, joint analysis of the combined individual-level data [Schizophrenia Psychiatric Genome-Wide Association Study Consortium, 2011].
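The following sketch shows two standard fixed-effects ways of combining summary-level results across studies: sample-size weighted combination of Z-scores and inverse-variance weighted combination of effect estimates. It is a generic illustration, not the dissertation's implementation; the function names and the example numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


def sample_size_weighted_meta(z_scores, sample_sizes):
    """Combine per-study Z-scores using weights proportional to sqrt(N)."""
    weights = [sqrt(n) for n in sample_sizes]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    z_meta = numerator / sqrt(sum(w * w for w in weights))
    return z_meta, two_sided_p(z_meta)


def inverse_variance_weighted_meta(betas, std_errors):
    """Combine per-study effect estimates using weights 1 / SE^2."""
    weights = [1 / se ** 2 for se in std_errors]
    beta_meta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    se_meta = sqrt(1 / sum(weights))
    return beta_meta, se_meta, two_sided_p(beta_meta / se_meta)


# Hypothetical summary statistics from four equally sized studies.
z_meta, p_meta = sample_size_weighted_meta([1.0, 1.0, 1.0, 1.0], [100, 100, 100, 100])
beta_ivw, se_ivw, p_ivw = inverse_variance_weighted_meta([0.2, 0.2], [0.1, 0.1])
```

Both approaches need only per-study summaries, which is what distinguishes meta-analysis from joint analysis of the pooled individual-level data.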

Although early genotyping arrays can only assay hundreds of thousands of common variants per individual, these variants are sufficient to tag a large proportion of the common variation in the population [International HapMap Consortium, 2005]. Since studies use different genotype arrays, only the small subset of overlapping variants can be meta-analyzed together directly. Genotype imputation using early reference panels (such as HapMap haplotypes [International HapMap Consortium, 2005]) fills in missing common genotypes with high accuracy, and allows the meta-analysis of the same dense set of genetic markers across all available samples [Marchini et al., 2007; Li et al., 2010].

Nearly all associated variants identified by GWAS are common [Hindorff et al., 2012].


