There is a wide variety of data mining methods available, and in exploratory data analysis it is generally useful to apply several different methods to the same data set. This raises the question of whether the results found by one method reflect the same phenomenon as the results of another method, or whether they capture in some sense unrelated properties of the data. For example, clustering can indicate a clear cluster structure, and computing correlations between variables can show that the data contain many significant correlations. However, the correlations may in fact be determined by the cluster structure. In this paper, we consider the problem of randomizing data so that previously discovered patterns or models are taken into account. The randomization methods can be used in iterative data mining: at each step of the data mining process, the randomization produces random samples from the set of data matrices that satisfy the already discovered patterns or models. That is, given a data set and some statistics of the data (e.g., cluster centers or co-occurrence counts), the randomization methods sample data sets whose values of the given statistics are similar to those of the original data set. We use Metropolis sampling based on local swaps to achieve this. We describe experiments on real data that demonstrate the usefulness of our approach. Our results indicate that in many cases the results of, e.g., clustering actually imply the results of, say, frequent pattern discovery.
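The idea of Metropolis sampling with local swaps can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not state: the data is a 0-1 matrix, the only statistic to preserve is the co-occurrence count of one column pair, and a swap that moves this count away from its original value is accepted with probability exp(-weight * increase in error). The names `cooccurrence`, `metropolis_swap`, `pair`, and `weight` are hypothetical.

```python
import math
import random


def cooccurrence(data, a, b):
    """Count the rows in which columns a and b are both 1."""
    return sum(1 for row in data if row[a] and row[b])


def metropolis_swap(data, pair, steps=2000, weight=2.0, seed=0):
    """Sample a 0-1 matrix with the same row and column sums as `data`,
    biased towards keeping the co-occurrence count of `pair` close to
    its original value. All parameters are illustrative assumptions."""
    rng = random.Random(seed)
    a, b = pair
    target = cooccurrence(data, a, b)
    current = [row[:] for row in data]  # work on a copy of the input
    n_rows, n_cols = len(current), len(current[0])
    cur_count = target

    for _ in range(steps):
        r1, r2 = rng.randrange(n_rows), rng.randrange(n_rows)
        c1, c2 = rng.randrange(n_cols), rng.randrange(n_cols)
        # A local swap needs ones at (r1, c1) and (r2, c2) and zeros at
        # (r1, c2) and (r2, c1); swapping them keeps all margins fixed.
        swappable = (
            current[r1][c1] == 1 and current[r2][c2] == 1
            and current[r1][c2] == 0 and current[r2][c1] == 0
        )
        if not swappable:
            continue

        old_err = abs(cur_count - target)
        current[r1][c1], current[r1][c2] = 0, 1
        current[r2][c2], current[r2][c1] = 0, 1
        new_count = cooccurrence(current, a, b)
        new_err = abs(new_count - target)

        # Metropolis rule: always accept if the error does not grow,
        # otherwise accept with an exponentially decaying probability.
        accept = new_err <= old_err or (
            rng.random() < math.exp(-weight * (new_err - old_err))
        )
        if accept:
            cur_count = new_count
        else:
            # Rejected: undo the swap.
            current[r1][c1], current[r1][c2] = 1, 0
            current[r2][c2], current[r2][c1] = 1, 0
    return current
```

In an iterative setting, one would draw many such samples, recompute the pattern of interest (for example, a frequent itemset count) on each, and compare the resulting distribution with the value observed in the original data. A larger `weight` keeps the constrained statistic closer to its original value at the cost of slower mixing.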