Mining top-K covering rule groups for gene expression data

Authors:
Gao Cong;Kian-Lee Tan;Anthony K. H. Tung;Xin Xu
Affiliations:
University of Edinburgh;National University of Singapore;National University of Singapore;National University of Singapore
Venue:
Proceedings of the 2005 ACM SIGMOD international conference on Management of data
Year:
2005

Citing 17
Cited 33

Exploratory mining and pruning optimizations of constrained associations rules

SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Making large-scale support vector machine learning practical

Advances in kernel methods
Pruning and summarizing the discovered associations

KDD '99 Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining
Mining the most interesting rules

KDD '99 Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining
Mining frequent patterns without candidate generation

SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Mining frequent patterns by pattern-growth: methodology and implications

ACM SIGKDD Explorations Newsletter - Special issue on “Scalable data mining algorithms”
Bioinformatics: the machine learning approach

Bioinformatics: the machine learning approach
Mining Optimized Association Rules with Categorical and Numeric Attributes

ICDE '98 Proceedings of the Fourteenth International Conference on Data Engineering
Fast Algorithms for Mining Association Rules in Large Databases

VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
Using transposition for pattern discovery from microarray data

DMKD '03 Proceedings of the 8th ACM SIGMOD workshop on Research issues in data mining and knowledge discovery
MaPle: A Fast Algorithm for Maximal Pattern-based Clustering

ICDM '03 Proceedings of the Third IEEE International Conference on Data Mining
CLOSET+: searching for the best strategies for mining frequent closed itemsets

Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
Carpenter: finding closed patterns in long biological datasets

Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
FARMER: finding interesting rule groups in microarray datasets

SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
Mining coherent gene clusters from gene-sample-time microarray data

Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
Bagging, boosting, and C4.5

AAAI'96 Proceedings of the thirteenth national conference on Artificial intelligence - Volume 1
A general approach to mining quality pattern-based clusters from microarray data

DASFAA'05 Proceedings of the 10th international conference on Database Systems for Advanced Applications

On discovery of maximal confident rules without support pruning in microarray data

Proceedings of the 5th international workshop on Bioinformatics
CCCS: a top-down associative classifier for imbalanced class distribution

Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
On Mining Instance-Centric Classification Rules

IEEE Transactions on Knowledge and Data Engineering
Mining statistically important equivalence classes and delta-discriminative emerging patterns

Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining
High Confidence Rule Mining for Microarray Analysis

IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
CSV: visualizing and mining cohesive subgraphs

Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Direct mining of discriminative and essential frequent patterns via model-based search tree

Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
CARSVM: A class association rule-based classification framework and its application to gene expression data

Artificial Intelligence in Medicine
Top-down mining of frequent closed patterns from very high dimensional data

Information Sciences: an International Journal
Linguistic recognition system for identification of some possible genes mediating the development of lung adenocarcinoma

Information Fusion
Mining sequential patterns and tree patterns to detect erroneous sentences

AAAI'07 Proceedings of the 22nd national conference on Artificial intelligence - Volume 1
Interval based fuzzy systems for identification of important genes from microarray gene expression data: Application to carcinogenic development

Journal of Biomedical Informatics
Efficient itemset generator discovery over a stream sliding window

Proceedings of the 18th ACM conference on Information and knowledge management
Effectiveness of fuzzy discretization for class association rule-based classification

ISMIS'08 Proceedings of the 17th international conference on Foundations of intelligent systems
Mining characteristic relations bind to RNA secondary structures

IEEE Transactions on Information Technology in Biomedicine
Direct mining of discriminative patterns for classifying uncertain data

Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining
Kernel based gene expression pattern discovery and its application on cancer classification

Neurocomputing
Cohesion: A concept and framework for confident association discovery with potential application in microarray mining

Applied Soft Computing
Constructing classification features using minimal predictive patterns

CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
Adjusting class association rules from global and local perspectives based on evolutionary computation

KSEM'10 Proceedings of the 4th international conference on Knowledge science, engineering and management
An approach for adaptive associative classification

Expert Systems with Applications: An International Journal
Summarizing frequent patterns using profiles

DASFAA'06 Proceedings of the 11th international conference on Database Systems for Advanced Applications
Effective classification by integrating support vector machine and association rule mining

IDEAL'06 Proceedings of the 7th international conference on Intelligent Data Engineering and Automated Learning
A Top-r Feature Selection Algorithm for Microarray Gene Expression Data

IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
Top-k interesting phrase mining in ad-hoc collections using sequence pattern indexing

Proceedings of the 15th International Conference on Extending Database Technology
An evolutionary approach to rank class association rules with feedback mechanism

Expert Systems with Applications: An International Journal
Improving classification accuracy of associative classifiers by using k-conflict-rule preservation

Proceedings of the 7th International Conference on Ubiquitous Information Management and Communication
Simultaneous gene selection and cancer classification using a hybrid group search optimizer

Proceedings of the 15th annual conference companion on Genetic and evolutionary computation
Four matroidal structures of covering and their relationships with rough sets

International Journal of Approximate Reasoning
Data Mining for Biologists

International Journal of Knowledge Discovery in Bioinformatics
Regularized Gaussian Mixture Model based discretization for gene expression data association mining

Applied Intelligence
CAR-NF: A classifier based on specific rules with high netconf

Intelligent Data Analysis
A feature selection method using improved regularized linear discriminant analysis

Machine Vision and Applications

Abstract

In this paper, we propose a novel algorithm to discover the top-k covering rule groups for each row of gene expression profiles. Several experiments on real bioinformatics datasets show that the new top-k covering rule mining algorithm is orders of magnitude faster than previous association rule mining algorithms. Furthermore, we propose a new classification method, RCBT. The RCBT classifier is constructed from the top-k covering rule groups, so the rule groups generated for building it are bounded in number. This is in contrast to existing rule-based classification methods such as CBA [19], which, despite generating an excessive number of redundant rules, are still unable to cover some training data with the discovered rules. Experiments show that the RCBT classifier can match or outperform other state-of-the-art classifiers on several benchmark gene expression datasets. In addition, the top-k covering rule groups themselves directly provide insights into the mechanisms responsible for diseases.
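
To make the notion of "top-k covering rule groups for each row" concrete, the following is a brute-force illustration on a toy dataset. It is not the algorithm proposed in the paper, which is designed to avoid this kind of exhaustive enumeration. The abstract does not spell out the definitions, so the sketch assumes the following: each row is a set of discretized gene "items" plus a class label; a rule group is the set of rules whose antecedents are matched by exactly the same rows; and groups are ranked by confidence, then by support. The function name `top_k_covering_rule_groups`, the parameter `min_support`, and the gene names are hypothetical.

```python
from itertools import combinations

def top_k_covering_rule_groups(rows, target, k, min_support=1):
    """Brute-force sketch. rows: list of (frozenset_of_items, class_label).

    Returns {row_index: [(confidence, support, antecedent), ...]} holding the
    k best rule groups that cover each row of class `target`.
    """
    items = sorted(set().union(*(its for its, _ in rows)))
    groups = {}  # set of matching row ids -> (confidence, support, antecedent)
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            antecedent = frozenset(combo)
            covered = frozenset(
                i for i, (its, _) in enumerate(rows) if antecedent <= its
            )
            if not covered:
                continue
            # Support: number of matching rows that carry the target class.
            support = sum(1 for i in covered if rows[i][1] == target)
            if support < min_support:
                continue
            confidence = support / len(covered)
            prev = groups.get(covered)
            # Represent each group by its largest antecedent.
            if prev is None or len(antecedent) > len(prev[2]):
                groups[covered] = (confidence, support, antecedent)

    result = {}
    for i, (_, label) in enumerate(rows):
        if label != target:
            continue
        covering = [g for cov, g in groups.items() if i in cov]
        # Rank by confidence, then support; item names break remaining ties.
        covering.sort(key=lambda g: (-g[0], -g[1], sorted(g[2])))
        result[i] = covering[:k]
    return result

# Toy data: three samples described by hypothetical discretized genes.
rows = [
    (frozenset({"g1", "g2"}), "cancer"),
    (frozenset({"g1", "g3"}), "cancer"),
    (frozenset({"g2", "g3"}), "normal"),
]
top1 = top_k_covering_rule_groups(rows, target="cancer", k=1)
print(top1[0])  # best group covering row 0: antecedent {g1}, confidence 1.0
```

Because at most k groups are kept per row, the total number of groups retained is at most k times the number of rows, which matches the abstract's remark that the rule groups are bounded in number; every row of the target class is also covered by at least one group whenever `min_support` is 1.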