Causal gene identification using combinatorial V-structure search

Authors:
Ruichu Cai; Zhenjie Zhang; Zhifeng Hao
Affiliations:
Faculty of Computer Science, Guangdong University of Technology, Guangzhou, PR China and State Key Laboratory for Novel Software Technology, Nanjing University, PR China; Advanced Digital Sciences Center, Illinois at Singapore Pte. Ltd., Singapore; Faculty of Computer Science, Guangdong University of Technology, Guangzhou, PR China
Venue:
Neural Networks
Year:
2013

Abstract

With the advances in biomedical techniques over the last decade, the costs of human genome sequencing and genomic activity monitoring have been falling rapidly. To support the large genome-based industry expected in the near future, researchers are eager to find killer applications based on human genome information. Causal gene identification is one of the most promising of these applications: it may help potential patients estimate their risk of certain genetic diseases and may help locate target genes for subsequent genetic therapy. Unfortunately, existing pattern recognition techniques, such as Bayesian networks, cannot be directly applied to find accurate causal relationships between genes and diseases, mainly because of the insufficient number of samples and the extremely high dimensionality of the gene space. In this paper, we present the first practical solution to causal gene identification. It uses a new combinatorial formulation over V-Structures, which are commonly used in conventional Bayesian networks, and explores combinations of significant V-Structures. We prove that the combinatorial search problem is NP-hard under general settings of the significance measure on V-Structures, and present a greedy algorithm that finds sub-optimal results. Extensive experiments show that our proposal is both scalable and effective, and they yield interesting findings on causal genes in real human genome data.
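
For readers unfamiliar with the term, a V-Structure in a Bayesian network is a pattern X -> Z <- Y in which two otherwise independent causes X and Y share a common effect Z, so that X and Y become dependent once Z is conditioned on. The Python sketch below illustrates only this textbook property on synthetic data; it is not the significance measure or the greedy search of the paper. The function names, the thresholds `low` and `high`, and the variables `gene_a`, `gene_b`, and `phenotype` are all assumptions introduced for illustration, and a linear partial correlation stands in for a proper conditional independence test.

```python
import numpy as np


def partial_corr(x, y, z):
    """Correlation between x and y after linearly regressing z out of both."""
    design = np.column_stack([np.ones_like(z), z])
    res_x = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    res_y = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    return float(np.corrcoef(res_x, res_y)[0, 1])


def looks_like_v_structure(x, y, z, low=0.1, high=0.3):
    """Heuristic check for x -> z <- y (thresholds are illustrative).

    x and y should be nearly uncorrelated marginally, but clearly
    correlated once z is conditioned on.
    """
    marginal = abs(float(np.corrcoef(x, y)[0, 1]))
    conditional = abs(partial_corr(x, y, z))
    return marginal < low and conditional > high


rng = np.random.default_rng(0)
n = 5000

# Collider: two independent "genes" jointly drive a "phenotype".
gene_a = rng.normal(size=n)
gene_b = rng.normal(size=n)
phenotype = gene_a + gene_b + 0.5 * rng.normal(size=n)

# Chain for contrast: chain_x -> chain_z -> chain_y, which is not a V-Structure.
chain_x = rng.normal(size=n)
chain_z = chain_x + 0.5 * rng.normal(size=n)
chain_y = chain_z + 0.5 * rng.normal(size=n)

collider_detected = looks_like_v_structure(gene_a, gene_b, phenotype)
chain_detected = looks_like_v_structure(chain_x, chain_y, chain_z)
print(collider_detected, chain_detected)
```

On the synthetic collider the two genes are marginally uncorrelated but strongly (negatively) correlated given the phenotype, whereas in the chain the endpoints are already correlated marginally, so the check rejects it. With few samples and many genes, as in the setting the abstract describes, such pairwise tests become unreliable, which is the difficulty the paper's combinatorial formulation is aimed at.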