Association Rule Mining algorithms operate on a data matrix (e.g., customers $\times$ products) to derive association rules [AIS93b, SA96]. We propose a new paradigm, namely, Ratio Rules, which are quantifiable in that we can measure the “goodness” of a set of discovered rules. As the measure of “goodness” we propose the “guessing error”, that is, the root-mean-square error of the reconstructed values of the cells of the given matrix when we pretend that they are unknown. Another contribution is a novel method to guess missing/hidden values from the Ratio Rules that our method derives. For example, if somebody bought \$10 of milk and \$3 of bread, our rules can “guess” the amount spent on butter. Thus, unlike association rules, Ratio Rules can perform a variety of important tasks such as forecasting, answering “what-if” scenarios, detecting outliers, and visualizing the data. Moreover, we show that we can compute Ratio Rules in a single pass over the data set with small memory requirements (a few small matrices), in contrast to association rule mining methods, which require multiple passes and/or large memory. Experiments on several real data sets (e.g., basketball and baseball statistics, biological data) demonstrate that the proposed method: (a) leads to rules that make sense; (b) can find large itemsets in binary matrices, even in the presence of noise; and (c) consistently achieves a “guessing error” up to 5 times lower than that of straightforward column averages.
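To make the milk/bread/butter example and the “guessing error” concrete, here is a minimal sketch in Python with NumPy. The abstract does not say how the rules are derived, so the sketch assumes they can be stood in for by the top-k principal directions of the column-centered matrix, accumulated in one pass over the rows. The function names (`fit_ratio_rules`, `guess_missing`, `guessing_error`), the toy data, and the least-squares fill-in step are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def fit_ratio_rules(rows, k):
    """One pass over the rows, keeping only a column-sum vector and an M x M matrix.

    Assumption: the rules are approximated by the top-k eigenvectors of the
    scatter matrix of the column-centered data.
    """
    n, col_sum, outer_sum = 0, None, None
    for x in rows:
        x = np.asarray(x, dtype=float)
        if col_sum is None:
            col_sum = np.zeros_like(x)
            outer_sum = np.zeros((x.size, x.size))
        n += 1
        col_sum += x
        outer_sum += np.outer(x, x)
    mean = col_sum / n
    scatter = outer_sum - n * np.outer(mean, mean)
    _, eigvecs = np.linalg.eigh(scatter)   # eigenvalues come back in ascending order
    V = eigvecs[:, ::-1][:, :k]            # keep the k strongest directions
    return mean, V

def guess_missing(row, known, mean, V):
    """Fill the hidden cells of `row`, given the indices of the known columns."""
    centered = row[known] - mean[known]
    # Least-squares coordinates in rule space, using only the known columns.
    coeffs, *_ = np.linalg.lstsq(V[known, :], centered, rcond=None)
    guess = mean + V @ coeffs
    hidden = np.setdiff1d(np.arange(row.size), known)
    filled = row.copy()
    filled[hidden] = guess[hidden]
    return filled

def guessing_error(X, mean, V):
    """Root-mean-square error when each cell is hidden in turn and then guessed."""
    n, m = X.shape
    total = 0.0
    for i in range(n):
        for j in range(m):
            known = np.array([c for c in range(m) if c != j])
            total += (guess_missing(X[i], known, mean, V)[j] - X[i, j]) ** 2
    return np.sqrt(total / (n * m))

# Hypothetical toy data: spending on (milk, bread, butter) in a fixed 10:3:2 ratio.
X = np.outer([0.5, 1.0, 1.5, 2.0, 3.0], [10.0, 3.0, 2.0])
mean, V = fit_ratio_rules(X, k=1)

customer = np.array([10.0, 3.0, np.nan])   # butter amount is hidden
filled = guess_missing(customer, np.array([0, 1]), mean, V)
rule_error = guessing_error(X, mean, V)
baseline_error = np.sqrt(((X - mean) ** 2).mean())  # always guess the column average
print(filled[2], rule_error, baseline_error)
```

Because the toy rows are exact multiples of one ratio, a single rule reconstructs every hidden cell almost perfectly, while the column-average baseline does not. For brevity the sketch fits and evaluates on the same matrix; a fairer comparison would score rows that were held out from fitting.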