CARSVM: A class association rule-based classification framework and its application to gene expression data

Authors:
Keivan Kianmehr; Reda Alhajj
Affiliations:
BIDEALS Group, Department of Computer Science, University of Calgary, 2500 University Drive NW, Calgary, Alberta, Canada T2N 1N4 (both authors)
Venue:
Artificial Intelligence in Medicine
Year:
2008

Abstract

Objective: In this study, we aim to build a classification framework, the CARSVM model, which integrates association rule mining and the support vector machine (SVM). The goal is to benefit from the advantages of both, namely the discriminative knowledge represented by class association rules and the classification power of the SVM algorithm, in order to construct an efficient and accurate classifier that mitigates the interpretability problem of SVM as a traditional machine learning technique and overcomes the efficiency issues of associative classification algorithms.
Method: In our proposed framework, instead of the original training set, a set of rule-based feature vectors, generated from the discriminative ability of class association rules over the training samples, is presented to the learning component of the SVM algorithm. We show that rule-based feature vectors are a high-quality source of discriminative knowledge that can substantially improve the prediction power of SVM and associative classification techniques. They also make the resulting model easier for users to understand and interpret.
Results: We used four datasets from the UCI Machine Learning Repository to evaluate the performance of the developed system in comparison with five well-known existing classification methods. Because gene expression analysis is an important and popular real-world application of classification, we present an extension of CARSVM, combined with feature selection, for gene expression data. We then describe how this combination provides biologists with an efficient and understandable classifier model. The reported test results and their biological interpretation demonstrate the applicability, efficiency and effectiveness of the proposed model.
Conclusion: The results show that a considerable increase in classification accuracy can be obtained when the rule-based feature vectors are integrated into the learning process of the SVM algorithm. Regarding applicability, the results obtained from gene expression analysis indicate that the CARSVM system can be used in a variety of real-world applications with some adjustments.
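
The idea of rule-based feature vectors can be illustrated with a minimal Python sketch. The abstract does not give the exact feature definition, so this sketch assumes one feature per class association rule, equal to the rule's confidence when the sample satisfies the rule's antecedent and 0.0 otherwise. The `Rule` type, the item names such as `geneA=high`, and the confidence values are all hypothetical and are not taken from the paper.

```python
from typing import FrozenSet, List, NamedTuple


class Rule(NamedTuple):
    """A class association rule (hypothetical representation)."""
    antecedent: FrozenSet[str]  # items that must all appear in a sample
    label: str                  # class predicted by the rule
    confidence: float           # assumed rule-strength score


def rule_feature_vector(sample_items: List[str], rules: List[Rule]) -> List[float]:
    """Map a sample to one feature per rule.

    Assumption: a feature equals the rule's confidence if the sample
    contains every item of the antecedent, and 0.0 otherwise.
    """
    items = set(sample_items)
    return [r.confidence if r.antecedent <= items else 0.0 for r in rules]


# Made-up rules and samples, for illustration only.
rules = [
    Rule(frozenset({"geneA=high"}), "tumor", 0.9),
    Rule(frozenset({"geneB=low", "geneC=high"}), "normal", 0.8),
]
samples = [
    ["geneA=high", "geneB=low"],
    ["geneB=low", "geneC=high"],
]

# These vectors replace the original training samples as input to the SVM.
vectors = [rule_feature_vector(s, rules) for s in samples]
```

Under these assumptions, the resulting vectors, rather than the original attribute values, would be passed to any SVM implementation, for example `sklearn.svm.SVC().fit(vectors, labels)` in scikit-learn. Because each feature corresponds to one rule, a reader can trace which rules contributed to a sample's representation.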