This article presents a new interestingness measure for association rules called confidence gain (CG). The focus is on extracting human associations rather than associations between market products. Human and market associations differ in two main ways. The first is the strong asymmetry of human associations (e.g., the association “shampoo” → “hair” is much stronger than “hair” → “shampoo”), whereas in market products asymmetry is less intuitive and less evident. The second is the background knowledge humans employ when presented with a stimulus (input phrase).

CG compares the local confidence of a given term to its average confidence throughout a given database. CG is found to outperform several association measures because it captures the asymmetric notion of an association (as in the confidence measure) and adds a comparison to an expected confidence (as in the lift measure). The use of average confidence introduces the notion of “background knowledge” into the CG measure.

Various experiments have shown that CG and local confidence gain (a low-complexity version of CG) successfully generate association rules when compared to human free associations. The experiments include a large-scale “free association Turing test” in which human free associations were compared to associations generated by CG and by other association measures. Rules discovered by CG were found to be significantly better than those discovered by other measures.

CG can be used for many purposes, such as personalization, sense disambiguation, query expansion, and improving classification performance of small item sets within large databases.

Although CG was found to be useful for Internet data retrieval, the results can readily be applied to any type of database.
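
The abstract describes CG only in words and gives no formula. As a rough illustration, the Python sketch below assumes that the gain of a rule A → B is the ratio of its confidence to the average confidence of B over all other terms in the database. That ratio form, the function names, and the toy `docs` data are all assumptions made for illustration and are not taken from the article. Confidence and lift are included for comparison, since the abstract relates CG to both.

```python
def confidence(transactions, antecedent, consequent):
    """conf(A -> B): fraction of transactions containing A that also contain B."""
    with_a = [t for t in transactions if antecedent in t]
    if not with_a:
        return 0.0
    return sum(1 for t in with_a if consequent in t) / len(with_a)


def lift(transactions, antecedent, consequent):
    """lift(A -> B) = conf(A -> B) / P(B); symmetric in A and B."""
    p_b = sum(1 for t in transactions if consequent in t) / len(transactions)
    if p_b == 0:
        return 0.0
    return confidence(transactions, antecedent, consequent) / p_b


def average_confidence(transactions, consequent):
    """Mean of conf(X -> consequent) over every other term X in the database."""
    terms = set().union(*transactions) - {consequent}
    if not terms:
        return 0.0
    return sum(confidence(transactions, x, consequent) for x in terms) / len(terms)


def confidence_gain_sketch(transactions, antecedent, consequent):
    """ASSUMED form of the gain: local confidence divided by average confidence.

    This ratio is an illustrative guess, not the article's definition.
    """
    avg = average_confidence(transactions, consequent)
    if avg == 0:
        return 0.0
    return confidence(transactions, antecedent, consequent) / avg


# Hypothetical toy database: each "transaction" is the set of terms in one document.
docs = [
    {"shampoo", "hair"},
    {"shampoo", "hair", "bottle"},
    {"hair", "cut"},
    {"hair", "brush"},
    {"bottle", "water"},
]

for a, b in [("shampoo", "hair"), ("hair", "shampoo")]:
    print(
        f"{a} -> {b}:",
        "confidence", confidence(docs, a, b),
        "lift", lift(docs, a, b),
        "gain (sketch)", confidence_gain_sketch(docs, a, b),
    )
```

On this toy data, confidence differs between the two directions while lift does not, which is the asymmetry the abstract attributes to confidence. The sketched gain is also direction-dependent, because the average confidence in its denominator is computed per consequent; which direction scores higher depends entirely on the assumed ratio form and on the data.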