\delta-Tolerance Closed Frequent Itemsets

Authors:
James Cheng;Yiping Ke;Wilfred Ng
Affiliations:
The Hong Kong University of Science and Technology, Hong Kong;The Hong Kong University of Science and Technology, Hong Kong;The Hong Kong University of Science and Technology, Hong Kong
Venue:
ICDM '06 Proceedings of the Sixth International Conference on Data Mining
Year:
2006

Abstract

In this paper, we study an inherent problem of mining Frequent Itemsets (FIs): the number of FIs mined is often too large. The large number of FIs not only degrades mining performance, but also severely hinders the application of FI mining. In the literature, Closed FIs (CFIs) and Maximal FIs (MFIs) have been proposed as concise representations of FIs. However, the number of CFIs is still too large in many cases, while MFIs lose information about the frequency of the FIs. To address this problem, we relax the restrictive definition of CFIs and propose the \delta-Tolerance CFIs (\delta-TCFIs). Mining \delta-TCFIs recursively removes all subsets of a \delta-TCFI that fall within a frequency distance bounded by \delta. We propose two algorithms, CFI2TCFI and MineTCFI, to mine \delta-TCFIs. CFI2TCFI achieves very high accuracy on the estimated frequency of the recovered FIs but is less efficient when the number of CFIs is large, since it is based on CFI mining. MineTCFI is significantly faster and consumes less memory than state-of-the-art algorithms for mining concise representations of FIs, while its accuracy is only slightly lower than that of CFI2TCFI.
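
The relaxation described in the abstract can be illustrated with a small brute-force sketch. The abstract does not spell out the formal definition, so the code assumes one: an FI X is kept unless some frequent superset Y with exactly one more item satisfies freq(Y) >= (1 - \delta) * freq(X). Under that assumption, \delta = 0 yields the CFIs and \delta = 1 yields the MFIs. The function names `frequent_itemsets` and `tolerance_closed` and the toy transactions are hypothetical, and this is neither CFI2TCFI nor MineTCFI, only an exhaustive check suitable for tiny inputs.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Brute force: map every frequent itemset to its support count."""
    items = sorted(set().union(*transactions))
    result = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            candidate = frozenset(combo)
            count = sum(1 for t in transactions if candidate <= t)
            if count >= min_support:
                result[candidate] = count
    return result


def tolerance_closed(fis, delta):
    """Filter FIs under the ASSUMED delta-tolerance rule (see lead-in).

    X is dropped when a frequent superset Y with |Y| = |X| + 1 has
    freq(Y) >= (1 - delta) * freq(X); otherwise X is kept.
    """
    kept = {}
    for x, fx in fis.items():
        absorbed = any(
            len(y) == len(x) + 1 and x < y and fy >= (1 - delta) * fx
            for y, fy in fis.items()
        )
        if not absorbed:
            kept[x] = fx
    return kept


def labels(itemsets):
    """Render itemsets as sorted strings for easy reading."""
    return sorted("".join(sorted(x)) for x in itemsets)


# Toy database (hypothetical): supports are a=10, ab=4, abc=3.
transactions = [frozenset(t) for t in ["abc"] * 3 + ["ab"] + ["a"] * 6]
fis = frequent_itemsets(transactions, min_support=2)

closed = tolerance_closed(fis, delta=0.0)   # behaves like CFIs
relaxed = tolerance_closed(fis, delta=0.3)  # intermediate summary
maximal = tolerance_closed(fis, delta=1.0)  # behaves like MFIs

print(labels(closed))   # ['a', 'ab', 'abc']
print(labels(relaxed))  # ['a', 'abc']
print(labels(maximal))  # ['abc']
```

In this toy database, `ab` (support 4) is absorbed by `abc` (support 3) once \delta = 0.3, because 3 >= 0.7 * 4, while `a` (support 10) survives because no superset comes close to its frequency. This reflects the trade-off the abstract describes: fewer itemsets than the CFIs, with more frequency information retained than the MFIs.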