Classification rules are a convenient way of expressing regularities that exist within databases. They are particularly useful when we wish to find patterns that describe a defined class of interest, i.e. for the task of partial classification, or "nugget discovery". In this paper we address the problem of finding classification rules in databases containing nominal and ordinal attributes. The number of rules that can be formulated from a database is usually vast because of combinatorial explosion, so generating all rules in order to find the best ones (according to some stated criteria) is usually impractical, and alternative strategies must be used. We present an algorithm that delivers a clearly defined set of rules, the pc'-optimal set. This set describes the interesting associations in a database but excludes many rules that are simply minor variations of other rules. The algorithm addresses the problem of combinatorial explosion and handles databases comprising both nominal and ordinal attributes. To find the pc'-optimal set efficiently, the search uses novel pruning functions that take advantage of the properties of the pc'-optimal set. Our main contribution is a method of on-the-fly pruning that exploits the relationship between pc'-optimal sets and ordinal data. We show that these methods yield a considerable increase in efficiency, allowing the discovery of useful rules from many databases.
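To give a rough sense of the combinatorial explosion mentioned above, the following minimal sketch counts the syntactically distinct rule antecedents for a fixed class of interest. It rests on assumptions that are not taken from the paper: a nominal attribute is either unused or tested for equality with a single value, and an ordinal attribute is either unused or restricted to one contiguous range of its ordered values. The function name `count_antecedents` and its parameters are hypothetical, and the count is only an illustration of search-space size, not part of the pc'-optimal algorithm.

```python
from math import prod


def count_antecedents(nominal_sizes, ordinal_sizes):
    """Count non-empty conjunctive antecedents under the assumptions above.

    nominal_sizes: number of distinct values of each nominal attribute.
    ordinal_sizes: number of distinct ordered values of each ordinal attribute.
    """
    # Nominal attribute with k values: unused, or equal to one of k values.
    nominal_choices = [k + 1 for k in nominal_sizes]
    # Ordinal attribute with k values: unused, or one of the k*(k+1)/2
    # contiguous ranges lo..hi with lo <= hi.
    ordinal_choices = [k * (k + 1) // 2 + 1 for k in ordinal_sizes]
    # Multiply the per-attribute choices, then drop the empty antecedent.
    return prod(nominal_choices + ordinal_choices) - 1


if __name__ == "__main__":
    # Ten nominal attributes with 3 values each, five ordinal with 5 values each.
    print(count_antecedents([3] * 10, [5] * 5))
```

Even this modest hypothetical schema gives 4^10 × 16^5 − 1 = 2^40 − 1 candidate antecedents, about 1.1 × 10^12, which is why exhaustive generation is usually impractical and pruning during the search matters.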