Mining association rules between sets of items in large databases
SIGMOD '93 Proceedings of the 1993 ACM SIGMOD international conference on Management of data
Matrix computations (3rd ed.)
Efficiently mining long patterns from databases
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Data mining: concepts and techniques
Data mining: concepts and techniques
KDD-Cup 2000 organizers' report: peeling the onion
ACM SIGKDD Explorations Newsletter - Special issue on “Scalable data mining algorithms”
ICDE '95 Proceedings of the Eleventh International Conference on Data Engineering
Automatic Scheduler for Real-Time Vision Applications
IPDPS '01 Proceedings of the 15th International Parallel & Distributed Processing Symposium
Fast Algorithms for Mining Association Rules in Large Databases
VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
Efficiently mining frequent trees in a forest
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
Mining Top.K Frequent Closed Patterns without Minimum Support
ICDM '02 Proceedings of the 2002 IEEE International Conference on Data Mining
Beyond Independence: Probabilistic Models for Query Approximation on Binary Transaction Data
IEEE Transactions on Knowledge and Data Engineering
XRules: an effective structural classifier for XML data
Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
State of the art of graph-based data mining
ACM SIGKDD Explorations Newsletter
Mining protein family specific residue packing patterns from protein structure graphs
RECOMB '04 Proceedings of the eighth annual international conference on Resaerch in computational molecular biology
Approximating a collection of frequent sets
Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
Summarizing itemset patterns: a profile-based approach
Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining
Mining compressed frequent-pattern sets
VLDB '05 Proceedings of the 31st international conference on Very large data bases
Systematic Approach for Optimizing Complex Mining Tasks on Multiple Databases
ICDE '06 Proceedings of the 22nd International Conference on Data Engineering
Extracting redundancy-aware top-k patterns
Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
Summarizing itemset patterns using probabilistic models
Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
Data Mining and Knowledge Discovery
Computing frequent itemsets inside oracle 10G
VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
Factor graphs and the sum-product algorithm
IEEE Transactions on Information Theory
Cartesian contour: a concise representation for a collection of frequent sets
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
CP-summary: a concise representation for browsing frequent itemsets
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
Output space sampling for graph patterns
Proceedings of the VLDB Endowment
Mining representative subspace clusters in high-dimensional data
FSKD'09 Proceedings of the 6th international conference on Fuzzy systems and knowledge discovery - Volume 1
Block interaction: a generative summarization scheme for frequent patterns
Proceedings of the ACM SIGKDD Workshop on Useful Patterns
ESTATE: strategy for exploring labeled spatial datasets using association analysis
DS'10 Proceedings of the 13th international conference on Discovery science
Cube based summaries of large association rule sets
ADMA'10 Proceedings of the 6th international conference on Advanced data mining and applications: Part I
Summarizing frequent itemsets via pignistic transformation
EPIA'11 Proceedings of the 15th Portugese conference on Progress in artificial intelligence
Finding minimum representative pattern sets
Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining
International Journal of Data Warehousing and Mining
Summarizing probabilistic frequent patterns: a fast approach
Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining
Frequent subgraph summarization with error control
WAIM'13 Proceedings of the 14th international conference on Web-Age Information Management
Hi-index | 0.00 |
In this paper, we propose a set of novel regression-based approaches to effectively and efficiently summarize frequent itemset patterns. Specifically, we show that the problem of minimizing the restoration error for a set of itemsets based on a probabilistic model corresponds to a non-linear regression problem. We show that under certain conditions, we can transform the nonlinear regression problem to a linear regression problem. We propose two new methods, k-regression and tree-regression, to partition the entire collection of frequent itemsets in order to minimize the restoration error. The K-regression approach, employing a K-means type clustering method, guarantees that the total restoration error achieves a local minimum. The tree-regression approach employs a decision-tree type of top-down partition process. In addition, we discuss alternatives to estimate the frequency for the collection of itemsets being covered by the k representative itemsets. The experimental evaluation on both real and synthetic datasets demonstrates that our approaches significantly improve the summarization performance in terms of both accuracy (restoration error), and computational cost.