Parametric Estimation of the Local False Discovery Rate for Identifying Genetic Associations

Authors:
Ye Yang;Farnoosh A. Aghababazadeh;David R. Bickel
Affiliations:
Bank of Nova Scotia (Scotiabank), Toronto;University of Ottawa, Ottawa;University of Ottawa, Ottawa
Venue:
IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
Year:
2013

Abstract

Many genome-wide association studies have been conducted to identify single nucleotide polymorphisms (SNPs) that are associated with particular diseases or other traits. The local false discovery rate (LFDR) estimated using semiparametric models has enjoyed success in simultaneous inference. However, semiparametric LFDR estimators can be biased because they tend to overestimate the proportion of nonassociated SNPs. We address the problem by adapting a simple parametric mixture model (PMM) and comparing it with the semiparametric mixture model (SMM) behind an LFDR estimator that is known to be conservatively biased. We also compare the PMM with a parametric nonmixture model (PNM). In our simulation studies, we thoroughly analyze the performance of the three models under different values of $p_{1}$, a prior probability that is approximately equal to the proportion of SNPs that are associated with the disease. When $p_{1} > 10\%$, the PMM generally performs better than the SMM. When $(p_{1}