Finding interesting patterns in data is one of the most important problems in data mining, and it has been studied actively for more than a decade. However, which patterns are interesting and which are not remains a largely open problem. The usual approach to detecting the interesting patterns (in a predefined class of patterns) is to determine quality values for potentially interesting patterns and to deem a pattern interesting if its quality value (i.e., the interestingness of the pattern) is higher than a given threshold value. Yet it is very difficult to find a threshold value and a way to determine the quality values such that the collection of patterns with quality values greater than the threshold contains almost all truly interesting patterns and only a few uninteresting ones. To enable more accurate characterization of interesting patterns, the use of constraints to further prune the pattern collection has been proposed. However, most constrained pattern discovery research has focused on structural constraints for the pattern collections and the patterns. We take a complementary approach and focus on constraining the quality values of the patterns: we propose quality value simplifications as a complement to structural constraints. As a special case of quality value simplifications, we consider discretizing the quality values. We analyze the worst-case error of certain discretization functions and give efficient discretization algorithms that minimize several loss functions. In addition, we show that discretizations of the quality values can be used to obtain small approximate condensed representations for collections of interesting patterns. We evaluate the proposed condensation approach experimentally using frequent itemsets.
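To make the idea of discretizing quality values concrete, the following is a minimal sketch and not the algorithms of the paper. It assumes the quality values are plain numbers (for example, itemset supports) and that the loss to control is the maximum absolute error. The function name `discretize_max_error` and the parameter `eps` are hypothetical, introduced only for this illustration. The sketch uses a standard greedy sweep over the sorted values: each group spans at most `2 * eps`, and every value in a group is replaced by the group's midpoint, so no value moves by more than `eps`.

```python
from typing import Dict, Iterable


def discretize_max_error(values: Iterable[float], eps: float) -> Dict[float, float]:
    """Map each distinct value to a representative at distance <= eps.

    Illustrative assumption: a greedy left-to-right sweep over the sorted
    values, which uses as few representatives as possible for a given eps.
    """
    if eps < 0:
        raise ValueError("eps must be non-negative")

    ordered = sorted(set(values))
    mapping: Dict[float, float] = {}
    i = 0
    while i < len(ordered):
        low = ordered[i]
        j = i
        # Extend the group while it still fits in a window of width 2 * eps.
        while j + 1 < len(ordered) and ordered[j + 1] - low <= 2 * eps:
            j += 1
        # The midpoint of the group is within eps of every group member.
        representative = (low + ordered[j]) / 2
        for v in ordered[i : j + 1]:
            mapping[v] = representative
        i = j + 1
    return mapping


if __name__ == "__main__":
    # Hypothetical supports of six frequent itemsets.
    supports = [10, 12, 13, 40, 45, 90]
    simplified = discretize_max_error(supports, eps=5)
    for original in supports:
        print(original, "->", simplified[original])
```

With a larger `eps`, fewer distinct representatives remain, which is the sense in which a discretized collection can be stored more compactly at the cost of bounded error.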