Regularized Gaussian Mixture Model based discretization for gene expression data association mining

Authors:
Ruichu Cai; Zhifeng Hao; Wen Wen; Lijuan Wang
Affiliations:
Faculty of Computer Science, Guangdong University of Technology, Guangzhou, P.R. China and State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, P.R. China; Faculty of Computer Science, Guangdong University of Technology, Guangzhou, P.R. China; Faculty of Computer Science, Guangdong University of Technology, Guangzhou, P.R. China; Faculty of Computer Science, Guangdong University of Technology, Guangzhou, P.R. China
Venue:
Applied Intelligence
Year:
2013



Abstract

Association rules have proven useful for gene-expression-based disease diagnosis because they are easy to interpret. A main obstacle to their application is the large number of rules generated from high-dimensional gene expression data. In this work, we show that the discretization preprocessing step is one cause of this explosion in the number of association rules. To alleviate the problem, we propose a Regularized Gaussian Mixture Model (RGMM) for discretizing continuous gene expression data. Under the Minimum Description Length framework, RGMM takes into account both the complexity of the discretization model and the information loss incurred by discretization. Extensive experiments on real-life gene expression data sets demonstrate the effectiveness of RGMM.
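
To make the idea of mixture-model discretization concrete, the following is a minimal sketch, not the authors' RGMM. It fits one-dimensional Gaussian mixtures to a single gene's expression values with plain EM, picks the number of components with the Bayesian Information Criterion (used here only as a simple stand-in for a description-length penalty on model complexity; the information-loss term mentioned in the abstract is not modeled), and maps each value to the index of its most probable component. The function names `normal_pdf`, `fit_gmm_1d`, and `discretize`, the `max_k` parameter, and the synthetic data are all assumptions made for illustration.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(xs, k, iters=100):
    """Fit a k-component 1-D Gaussian mixture with EM.

    Returns (weights, means, variances, log_likelihood).
    Means are initialised at evenly spaced quantiles, so the fit is deterministic.
    """
    n = len(xs)
    s = sorted(xs)
    means = [s[int((i + 0.5) * n / k)] for i in range(k)]
    mean_all = sum(xs) / n
    var_all = sum((x - mean_all) ** 2 for x in xs) / n or 1e-6
    variances = [var_all] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each point.
        resp = []
        for x in xs:
            p = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(p) or 1e-300
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weights, means and (floored) variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var_j, 1e-6)
    log_lik = sum(
        math.log(sum(w * normal_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)) or 1e-300)
        for x in xs
    )
    return weights, means, variances, log_lik


def discretize(xs, max_k=4):
    """Choose k in 1..max_k by BIC and return (k, labels).

    Labels are component indices ordered by component mean (0 = lowest mean).
    """
    n = len(xs)
    best = None
    for k in range(1, max_k + 1):
        w, m, v, log_lik = fit_gmm_1d(xs, k)
        # A k-component 1-D mixture has 3k - 1 free parameters.
        bic = -2.0 * log_lik + (3 * k - 1) * math.log(n)
        if best is None or bic < best[0]:
            best = (bic, k, w, m, v)
    _, k, w, m, v = best
    order = sorted(range(k), key=lambda j: m[j])
    rank = {j: r for r, j in enumerate(order)}
    labels = []
    for x in xs:
        j_best = max(range(k), key=lambda j: w[j] * normal_pdf(x, m[j], v[j]))
        labels.append(rank[j_best])
    return k, labels


if __name__ == "__main__":
    # Synthetic "expression levels" with a low and a high mode (illustrative only).
    rng = random.Random(0)
    values = [rng.gauss(0.0, 0.5) for _ in range(100)] + [rng.gauss(5.0, 0.5) for _ in range(100)]
    k, labels = discretize(values)
    print("selected components:", k)
    print("first and last labels:", labels[0], labels[-1])
```

Penalizing extra components keeps the number of discrete states per gene small. Fewer states mean fewer items for an association rule miner to combine, which is the link to the rule-explosion problem that the abstract describes.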