Motivation: Many complex disease syndromes, such as asthma, consist of a large number of highly related, rather than independent, clinical phenotypes. This raises a new technical challenge: identifying genetic variations that are associated simultaneously with correlated traits. Although a causal genetic variation may influence a group of highly correlated traits jointly, most previous association analyses considered each phenotype separately, or combined results from a set of single-phenotype analyses.

Results: We propose a new statistical framework called graph-guided fused lasso (GFlasso) to address this issue in a principled way. Our approach represents the dependency structure among the quantitative traits explicitly as a network, and leverages this trait network to encode structured regularizations in a multivariate regression model over the genotypes and traits, so that the genetic markers that jointly influence subgroups of highly correlated traits can be detected with high sensitivity and specificity. While most traditional methods examine each phenotype independently, our approach analyzes all of the traits jointly in a single statistical method to discover the genetic markers that perturb a subset of correlated traits jointly rather than a single trait. Using simulated datasets based on the HapMap consortium data and an asthma dataset, we compare the performance of our method with single-marker analysis and with other sparse regression methods that do not use any structural information in the traits. Our results show that there is a significant advantage in detecting the true causal single nucleotide polymorphisms when we incorporate the correlation pattern in traits using our proposed methods.

Availability: Software for GFlasso is available at http://www.sailing.cs.cmu.edu/gflasso.html

Contact: sssykim@cs.cmu.edu; ksohn@cs.cmu.edu
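To make the idea of network-encoded regularization concrete, the following is a minimal sketch of how a graph-guided fusion penalty over a trait network might be evaluated. The abstract does not state the objective, so everything here is an assumption for illustration: the squared-error loss, the lasso term weighted by `lam`, the fusion term weighted by `gamma` and by the absolute correlation of each edge, the correlation `threshold` used to build the network, and the function names `trait_graph` and `gflasso_objective`. It is not the released GFlasso software and it only evaluates an objective, it does not optimize it.

```python
import numpy as np


def trait_graph(Y, threshold=0.5):
    """Build a trait network from an (individuals x traits) matrix Y.

    Returns a list of edges (m, l, r_ml) for trait pairs whose absolute
    Pearson correlation exceeds `threshold` (an assumed, illustrative rule).
    """
    R = np.corrcoef(Y, rowvar=False)
    K = R.shape[0]
    return [
        (m, l, R[m, l])
        for m in range(K)
        for l in range(m + 1, K)
        if abs(R[m, l]) > threshold
    ]


def gflasso_objective(X, Y, B, edges, lam, gamma):
    """Evaluate an assumed graph-guided fused lasso objective.

    X: (individuals x markers) genotypes, Y: (individuals x traits),
    B: (markers x traits) regression coefficients.
    """
    # Squared-error loss of the multivariate regression.
    loss = np.sum((Y - X @ B) ** 2)
    # Lasso term: encourages few markers to be associated with each trait.
    sparsity = lam * np.sum(np.abs(B))
    # Fusion term: for each edge, pulls a marker's effects on the two
    # connected traits together (sign-flipped if the traits are
    # negatively correlated), weighted by the correlation strength.
    fusion = gamma * sum(
        abs(r) * np.sum(np.abs(B[:, m] - np.sign(r) * B[:, l]))
        for m, l, r in edges
    )
    return loss + sparsity + fusion
```

Under these assumptions, a marker whose effects on two strongly correlated traits agree incurs no fusion cost, whereas conflicting effects are penalized. This is one way a trait network can steer the fit toward markers that act on a correlated subgroup of traits jointly.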