In this paper, we propose a novel probabilistic approach to summarizing frequent itemset patterns. Such techniques are useful for summarization, post-processing, and end-user interpretation, particularly for problems where the resulting set of patterns is huge. In our approach, items in the dataset are modeled as random variables. We then construct a Markov Random Field (MRF) on these variables based on frequent itemsets and their occurrence statistics. The summarization proceeds in a level-wise, iterative fashion. Occurrence statistics of itemsets at the lowest level are used to construct an initial MRF. Statistics of itemsets at the next level can then be inferred from the model. Those patterns whose occurrence cannot be accurately inferred from the model are used to augment it, and the procedure is repeated until all frequent itemsets can be modeled. The resulting MRF affords a concise and useful representation of the original collection of itemsets. An extensive empirical study on real datasets shows that the new approach can effectively summarize a large number of itemsets and typically significantly outperforms extant approaches.
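The level-wise loop described above can be illustrated with a small sketch. This is not the paper's implementation: the abstract does not say how the MRF is learned or queried, so the sketch assumes a brute-force joint distribution over all item combinations, fitted with iterative proportional fitting, and a relative-error threshold for deciding when an itemset is "not accurately inferred". The names `fit_maxent`, `estimate`, `summarize`, and `tolerance` are hypothetical, and the supports are toy values chosen for illustration.

```python
from itertools import product


def fit_maxent(n_items, constraints, sweeps=200):
    """Fit a distribution over {0,1}^n_items whose itemset supports match
    `constraints`, using iterative proportional fitting.
    Assumption: brute force over all 2^n states, so only for tiny examples."""
    states = list(product((0, 1), repeat=n_items))
    prob = {s: 1.0 / len(states) for s in states}
    for _ in range(sweeps):
        for itemset, target in constraints.items():
            covered = {s for s in states if all(s[i] for i in itemset)}
            current = sum(prob[s] for s in covered)
            if current <= 0.0 or current >= 1.0:
                continue  # cannot rescale a degenerate marginal
            up = target / current
            down = (1.0 - target) / (1.0 - current)
            for s in states:
                prob[s] *= up if s in covered else down
    return prob


def estimate(prob, itemset):
    """Support of `itemset` as inferred from the fitted model."""
    return sum(p for s, p in prob.items() if all(s[i] for i in itemset))


def summarize(n_items, frequent, tolerance=0.1):
    """Level-wise summarization: keep only itemsets the model cannot infer.
    `frequent` maps itemsets (tuples of item indices) to relative supports."""
    by_level = {}
    for itemset, support in frequent.items():
        by_level.setdefault(len(itemset), {})[itemset] = support
    levels = sorted(by_level)
    kept = dict(by_level[levels[0]])        # lowest level seeds the model
    prob = fit_maxent(n_items, kept)
    for level in levels[1:]:
        poorly_inferred = {
            itemset: support
            for itemset, support in by_level[level].items()
            if abs(estimate(prob, itemset) - support) / support > tolerance
        }
        if poorly_inferred:                 # augment the model and refit
            kept.update(poorly_inferred)
            prob = fit_maxent(n_items, kept)
    return kept, prob


# Toy input: items 0 and 1 are correlated, item 2 is independent of both.
frequent = {
    (0,): 0.5, (1,): 0.5, (2,): 0.5,
    (0, 1): 0.4, (0, 2): 0.25, (1, 2): 0.25,
    (0, 1, 2): 0.2,
}
kept, model = summarize(3, frequent)
print(sorted(kept))
```

In this toy run only the correlated pair needs to be added to the singletons; the remaining itemsets are recovered from the model within the tolerance, so the kept set acts as the summary. A practical system would need approximate inference rather than enumeration of all states.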