Ranking the Interestingness of Summaries from Data Mining Systems
Proceedings of the Twelfth International Florida Artificial Intelligence Research Society Conference
We describe heuristics, based upon information theory and statistics, for ranking the interestingness of summaries generated from databases. The tuples in a summary are unique and can therefore be treated as a population described by some probability distribution. The four interestingness measures presented here are based upon common measures of diversity of a population: variance, the Simpson index, and the Shannon index. Using each of the proposed measures, we assign a single real value to a summary that describes its interestingness. Our experimental results show that the ranks assigned by the four interestingness measures are highly correlated.
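
As a rough illustration of how a summary can be scored with diversity measures of this kind, the sketch below computes textbook forms of the variance, the Simpson index, and the Shannon index over the tuple counts of a summary. The abstract does not give the exact formulas or the definitions of all four measures, so everything here is an assumption: the function names are hypothetical, each tuple is assumed to carry a count of the database rows it represents, the variance is taken around the uniform proportion 1/m with an m - 1 denominator, and the Shannon index uses base-2 logarithms.

```python
import math


def proportions(counts):
    """Convert per-tuple counts into a probability distribution."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive value")
    return [c / total for c in counts]


def variance_measure(counts):
    """Sample variance of the proportions around the uniform value 1/m.

    Assumed form; 0.0 means the tuples are perfectly evenly distributed.
    """
    p = proportions(counts)
    m = len(p)
    if m < 2:
        return 0.0
    mean = 1.0 / m
    return sum((pi - mean) ** 2 for pi in p) / (m - 1)


def simpson_index(counts):
    """Simpson index: sum of squared proportions (1/m when uniform, 1 when concentrated)."""
    return sum(pi * pi for pi in proportions(counts))


def shannon_index(counts):
    """Shannon index in bits: -sum(p * log2(p)), skipping zero proportions."""
    return -sum(pi * math.log2(pi) for pi in proportions(counts) if pi > 0)


if __name__ == "__main__":
    # Hypothetical summaries, each given as a list of tuple counts.
    summaries = {"uniform": [1, 1, 1, 1], "skewed": [2, 1, 1]}
    for name, counts in summaries.items():
        print(name,
              variance_measure(counts),
              simpson_index(counts),
              shannon_index(counts))
```

Each function returns a single real value per summary, so a set of summaries can be ranked by sorting on any one of the values. Whether a higher or lower value counts as more interesting is a design choice that this sketch leaves open.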