Redescription mining: structure theory and algorithms

Authors:
Laxmi Parida;Naren Ramakrishnan
Affiliations:
IBM Thomas J. Watson Research Center, Yorktown Heights, NY;Department of Computer Science, Virginia Tech, VA
Venue:
AAAI'05 Proceedings of the 20th national conference on Artificial intelligence - Volume 2
Year:
2005



Abstract

We introduce a new data mining problem, redescription mining, that unifies considerations of conceptual clustering, constructive induction, and logical formula discovery. Redescription mining begins with a collection of sets, views it as a propositional vocabulary, and identifies clusters of data that can be defined in at least two ways using this vocabulary. The primary contributions of this paper are conceptual and theoretical: (i) we formally study the space of redescriptions underlying a dataset and characterize their intrinsic structure, (ii) we identify impossibility as well as strong possibility results about when mining redescriptions is feasible, (iii) we present several scenarios of how we can custom-build redescription mining solutions for various biases, and (iv) we outline how many problems studied in the larger machine learning community are special cases of redescription mining. By highlighting its broad scope and relevance, we aim to establish the importance of redescription mining and make the case for a thrust in this new line of research.
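
To make the problem statement concrete, the following is a minimal brute-force sketch, not the algorithm developed in the paper. It assumes a toy vocabulary of named sets (the descriptor names, the objects, and the function `find_redescriptions` are all hypothetical), restricts expressions to single descriptors and pairwise AND/OR combinations, and reports every non-empty group of objects that at least two distinct expressions select.

```python
from collections import defaultdict
from itertools import combinations


def find_redescriptions(descriptors):
    """Group simple set expressions by the objects (extent) they select.

    Only single descriptors and pairwise AND/OR expressions are enumerated;
    this restricted expression language is an illustrative assumption.
    """
    by_extent = defaultdict(list)

    # Each descriptor on its own is an expression.
    for name, members in descriptors.items():
        by_extent[frozenset(members)].append(name)

    # Pairwise conjunctions and disjunctions of distinct descriptors.
    for (a, set_a), (b, set_b) in combinations(descriptors.items(), 2):
        by_extent[frozenset(set_a & set_b)].append(f"{a} AND {b}")
        by_extent[frozenset(set_a | set_b)].append(f"{a} OR {b}")

    # A redescription: a non-empty extent defined in at least two ways.
    return {
        extent: exprs
        for extent, exprs in by_extent.items()
        if extent and len(exprs) >= 2
    }


# Hypothetical toy data: objects 1..6 described by four sets.
toy = {
    "red": {1, 2, 3},
    "round": {2, 3, 4},
    "small": {1, 5},
    "heavy": {2, 3, 6},
}

result = find_redescriptions(toy)
for extent, exprs in result.items():
    print(sorted(extent), "<=>", " | ".join(exprs))
```

In this toy vocabulary the objects {2, 3} are selected by three different conjunctions, so each pair of those expressions is a redescription of the same cluster. Exhaustive enumeration of this kind grows quickly with the size of the vocabulary and the allowed expression length, which is one reason the structure and feasibility questions listed in the abstract matter.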