On support thresholds in associative classification

  • Authors:
  • Elena Baralis; Silvia Chiusano; Paolo Garza

  • Affiliation:
  • Corso Duca degli Abruzzi, Torino, Italy

  • Venue:
  • Proceedings of the 2004 ACM symposium on Applied computing
  • Year:
  • 2004


Abstract

Associative classification is a well-known technique for structured data classification. Most previous work on associative classification uses support-based pruning for rule extraction, usually setting the threshold to 1%. This threshold keeps rule extraction tractable and, on average, yields good accuracy. We believe this threshold may not be appropriate in all cases, since it does not take the class distribution of the dataset into account. In this paper we investigate the effect of the support threshold on classification accuracy. Lower support thresholds are often unfeasible with current extraction algorithms, or may cause the generation of a huge rule set. To observe the effect of varying the support threshold, we first propose a compact form to encode a complete rule set. We then develop a new classifier, named L3G, based on this compact form. Taking advantage of the compact form, the classifier can also be built with rather low-support rules. We ran a variety of experiments with different support thresholds on datasets from the UCI machine learning repository. The experiments showed that optimal accuracy is obtained at varying threshold values, sometimes lower than 1%.
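The support-based pruning the abstract discusses can be illustrated with a minimal sketch (not the paper's L3G classifier or its compact rule encoding): class association rules (itemset → class label) are enumerated, and any rule whose support falls below a chosen threshold is discarded. The toy dataset, function names, and brute-force enumeration below are all illustrative assumptions; real miners use Apriori- or FP-growth-style algorithms.

```python
from itertools import combinations

# Toy transactions (illustrative): each is (set of items, class label).
transactions = [
    ({"a", "b"}, "yes"),
    ({"a", "c"}, "yes"),
    ({"b", "c"}, "no"),
    ({"a", "b"}, "yes"),
]

def rule_support(itemset, label):
    """Fraction of transactions that contain `itemset` and have class `label`."""
    hits = sum(1 for items, cls in transactions
               if itemset <= items and cls == label)
    return hits / len(transactions)

def extract_rules(min_support):
    """Enumerate class association rules (itemset -> label) with
    support >= min_support. Brute force for clarity only."""
    items = sorted({i for t, _ in transactions for i in t})
    labels = sorted({c for _, c in transactions})
    rules = []
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            for label in labels:
                s = rule_support(set(combo), label)
                if s >= min_support:
                    rules.append((combo, label, s))
    return rules

# Lowering the threshold retains more (rarer) rules -- the trade-off
# between rule-set size and coverage that the paper investigates.
print(len(extract_rules(0.5)), len(extract_rules(0.25)))  # -> 3 8
```

On this toy data, halving the threshold nearly triples the rule set, which is why low thresholds quickly become unfeasible without a compact representation.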