Mining Optimized Gain Rules for Numeric Attributes

Authors:
Sergey Brin; Rajeev Rastogi; Kyuseok Shim
Venue:
IEEE Transactions on Knowledge and Data Engineering
Year:
2003

Abstract

Association rules are useful for determining correlations between attributes of a relation and have applications in the marketing, financial, and retail sectors. Optimized association rules, in turn, are an effective way to focus on the most interesting characteristics involving certain attributes. Such rules are permitted to contain uninstantiated attributes, and the problem is to determine instantiations that maximize the support, confidence, or gain of the rule. In this paper, we generalize the optimized gain association rule problem by permitting rules to contain disjunctions over uninstantiated numeric attributes. Our generalized association rules enable us to extract more useful information about seasonal and local patterns involving the uninstantiated attribute. For rules containing a single numeric attribute, we present an algorithm with linear complexity for computing optimized gain rules. We also propose a bucketing technique that can significantly reduce the input size by coalescing contiguous values without sacrificing optimality. For two numeric attributes, we present an approximation algorithm based on dynamic programming. Using recent results on binary space partitioning trees, we show that its approximations are within a constant factor of the optimal gain. Our experimental results with synthetic data sets for a single numeric attribute demonstrate that our algorithm scales linearly with the attribute's domain size as well as with the number of disjunctions. In addition, we show that applying our optimized rule framework to a real-life population survey data set enables us to discover interesting underlying correlations among the attributes.
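
The single-attribute problem described in the abstract can be illustrated with a small sketch. The abstract does not define gain, so the code below assumes the definition commonly used for optimized rules: the gain of a value range is the number of tuples in the range that satisfy the whole rule, minus a confidence threshold theta times the number of tuples in the range. It then coalesces contiguous values whose gains share a sign into buckets and picks at most k disjoint ranges with a textbook dynamic program. The function names, the sign-based bucketing criterion, and the sample data are all hypothetical, and the dynamic program is a standard stand-in, not necessarily the algorithm presented in the paper.

```python
from typing import List, Tuple

Bucket = Tuple[int, int, float]  # (first value index, last value index, total gain)


def bucket_gains(support: List[int], hits: List[int], theta: float) -> List[Bucket]:
    """Coalesce contiguous attribute values whose gains have the same sign.

    support[i]: tuples with attribute value i (assumed input layout).
    hits[i]:    those tuples that also satisfy the rest of the rule.
    Assumed gain of value i: hits[i] - theta * support[i].
    """
    buckets: List[Bucket] = []
    for i, (s, h) in enumerate(zip(support, hits)):
        g = h - theta * s
        # Extend the last bucket if its gain and this gain are both >= 0 or both < 0.
        if buckets and (buckets[-1][2] >= 0) == (g >= 0):
            start, _, total = buckets[-1]
            buckets[-1] = (start, i, total + g)
        else:
            buckets.append((i, i, g))
    return buckets


def best_gain(gains: List[float], k: int) -> float:
    """Maximum total gain of at most k disjoint contiguous ranges, in O(n * k) time."""
    neg_inf = float("-inf")
    # closed[j]: best total using at most j ranges over the prefix seen so far.
    closed = [0.0] * (k + 1)
    # open_[j]: best total using at most j ranges, the last one ending at the current item.
    open_ = [neg_inf] * (k + 1)
    for g in gains:
        # Descending j so closed[j - 1] still refers to the previous prefix.
        for j in range(k, 0, -1):
            open_[j] = max(open_[j], closed[j - 1]) + g
            closed[j] = max(closed[j], open_[j])
    return closed[k]


if __name__ == "__main__":
    # Hypothetical per-value counts for six consecutive attribute values.
    support = [10, 10, 10, 10, 10, 10]
    hits = [8, 7, 2, 1, 9, 3]
    buckets = bucket_gains(support, hits, theta=0.5)
    print(buckets)
    print(best_gain([g for _, _, g in buckets], k=2))
```

On this sample the six values collapse into four buckets, and the dynamic program returns the same total whether it runs on the raw per-value gains or on the bucket gains. That matches the intuition behind the abstract's claim: an optimal range never needs to start or end in the middle of a run of same-sign gains, so such runs can be merged before the search.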