Learning causal models for noisy biological data mining: an application to ovarian cancer detection

Authors:
Ghim-Eng Yap; Ah-Hwee Tan; Hwee-Hwa Pang
Affiliations:
School of Computer Engineering, Nanyang Technological University, Singapore; School of Computer Engineering, Nanyang Technological University, Singapore; School of Information Systems, Singapore Management University, Singapore
Venue:
AAAI'07: Proceedings of the 22nd National Conference on Artificial Intelligence - Volume 1
Year:
2007


Abstract

Undetected errors in expression measurements from high-throughput DNA microarrays and protein spectroscopy could seriously undermine diagnostic reliability in disease detection. Beyond being highly resilient to such errors, diagnostic models need to be comprehensible enough to support a deeper understanding of the causal interactions among biological entities such as genes and proteins. In this paper, we introduce a robust knowledge discovery approach that addresses both challenges. First, the causal interactions among the genes and proteins in the noisy expression data are discovered automatically through Bayesian network learning. Then, disease diagnosis based on the network is performed using a novel error-handling procedure, which automatically identifies noisy measurements and accounts for their uncertainty during diagnosis. An application to ovarian cancer detection shows that the approach effectively discovers causal interactions among cancer-specific proteins. With the proposed error-handling procedure, the network perfectly distinguishes between cancer patients and normal patients.
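
As a rough illustration of what accounting for measurement uncertainty during network-based diagnosis can look like, the sketch below uses standard virtual (soft) evidence on a tiny hand-built network. It is not the paper's error-handling procedure, which the abstract does not specify. The network shape (one disease node with two binary protein children), the protein names, every probability, and the per-measurement reliability values are hypothetical assumptions chosen only for the example.

```python
def posterior_cancer(prior, cpts, observations, reliability):
    """Return P(cancer | observations) for a disease node with protein children.

    prior:        P(cancer)
    cpts:         {protein: (P(high | normal), P(high | cancer))}
    observations: {protein: True if measured high, False if measured low}
    reliability:  {protein: probability that the measurement reports the
                   true level}; a missing entry means fully trusted (1.0)
    """
    scores = []
    for state, p_state in ((0, 1.0 - prior), (1, prior)):
        score = p_state
        for protein, observed_high in observations.items():
            p_high = cpts[protein][state]
            r = reliability.get(protein, 1.0)
            # Likelihood of the reading given each possible true level.
            like_if_high = r if observed_high else 1.0 - r
            like_if_low = 1.0 - r if observed_high else r
            # Marginalize over the hidden true level of the protein.
            score *= p_high * like_if_high + (1.0 - p_high) * like_if_low
        scores.append(score)
    return scores[1] / (scores[0] + scores[1])


# Hypothetical numbers, for illustration only.
cpts = {"protein_a": (0.1, 0.9), "protein_b": (0.2, 0.8)}
observations = {"protein_a": True, "protein_b": False}

# Every measurement taken at face value.
trusted = posterior_cancer(0.5, cpts, observations, {})

# protein_b flagged as noisy: reliability 0.5 makes its reading uninformative.
hedged = posterior_cancer(0.5, cpts, observations, {"protein_b": 0.5})
```

In this toy setting, a reliability of 1.0 reduces to ordinary hard evidence, while 0.5 removes a binary measurement's influence entirely; how suspect measurements are identified in the first place is left outside the sketch.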