The discovery of association rules from tabular databases comprising nominal and ordinal attributes

Authors:
G. Richards; V. J. Rayward-Smith
Affiliations:
School of Information Systems, University of East Anglia, Norwich, NR4 7TJ, UK. Tel.: +44 (0)1603 592308/ Fax: +44 (0)1603 593344/ E-mail: {gr,vjrs}@cmp.uea.ac.uk
Venue:
Intelligent Data Analysis
Year:
2005


Abstract

Classification rules are a convenient way of expressing regularities within databases. They are particularly useful for finding patterns that describe a defined class of interest, i.e., for the task of partial classification or "nugget discovery". In this paper we address the problem of finding classification rules in databases containing nominal and ordinal attributes. The number of rules that can be formulated from a database is potentially vast because of combinatorial explosion, so generating all rules in order to find the best ones (according to some stated criteria) is usually impractical and alternative strategies must be used. We present an algorithm that delivers a clearly defined set of rules, the pc'-optimal set. This set describes the interesting associations in a database but excludes many rules that are simply minor variations of other rules. The algorithm addresses the problem of combinatorial explosion and can find rules in databases comprising nominal and ordinal attributes. To find the pc'-optimal set efficiently, the search uses novel pruning functions that take advantage of the properties of the pc'-optimal set. Our main contribution is a method of on-the-fly pruning that exploits the relationship between pc'-optimal sets and ordinal data. We show that these methods yield a very considerable increase in efficiency, allowing the discovery of useful rules from many databases.
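
To make the kind of rule discussed in the abstract concrete, the following is a minimal sketch of how a single classification rule over nominal and ordinal attributes might be evaluated. It is not the paper's algorithm and does not implement pc'-optimality or its pruning functions, which the abstract does not define. It assumes that a nominal attribute is tested by equality, that an ordinal attribute is tested by an inclusive range, and that rule quality is measured by the standard support and confidence measures. The toy records and the names `matches`, `evaluate`, `colour`, `size` and `cls` are hypothetical.

```python
from typing import Dict, List, Tuple, Union

Record = Dict[str, Union[str, int]]

# Hypothetical toy table: "colour" is nominal, "size" is ordinal (1 < 2 < 3),
# and "cls" is the class attribute.
records: List[Record] = [
    {"colour": "red", "size": 1, "cls": "yes"},
    {"colour": "red", "size": 2, "cls": "yes"},
    {"colour": "red", "size": 3, "cls": "no"},
    {"colour": "blue", "size": 2, "cls": "no"},
    {"colour": "blue", "size": 1, "cls": "yes"},
]


def matches(record: Record,
            equals: Dict[str, str],
            ranges: Dict[str, Tuple[int, int]]) -> bool:
    """True if the record satisfies every antecedent condition.

    Assumption: nominal tests are equalities, ordinal tests are
    inclusive ranges (low <= value <= high).
    """
    nominal_ok = all(record[attr] == value for attr, value in equals.items())
    ordinal_ok = all(low <= record[attr] <= high
                     for attr, (low, high) in ranges.items())
    return nominal_ok and ordinal_ok


def evaluate(rows: List[Record],
             equals: Dict[str, str],
             ranges: Dict[str, Tuple[int, int]],
             class_attr: str,
             class_value: str) -> Tuple[float, float]:
    """Return (support, confidence) of the rule antecedent => class."""
    covered = [r for r in rows if matches(r, equals, ranges)]
    correct = [r for r in covered if r[class_attr] == class_value]
    support = len(correct) / len(rows)
    confidence = len(correct) / len(covered) if covered else 0.0
    return support, confidence


# Rule: colour = red AND 1 <= size <= 2  =>  cls = yes
support, confidence = evaluate(
    records, {"colour": "red"}, {"size": (1, 2)}, "cls", "yes")
```

Every choice of value for a nominal attribute and every choice of range for an ordinal attribute gives a different candidate rule, so the number of candidates multiplies with each attribute added to the antecedent. This is the combinatorial growth that makes exhaustive rule generation impractical and motivates pruning during the search.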