Capturing truthiness: mining truth tables in binary datasets

Authors:
Clifford Conley Owens, III; T. M. Murali; Naren Ramakrishnan
Affiliations:
Virginia Tech, VA; Virginia Tech, VA; Virginia Tech, VA
Venue:
Proceedings of the 2009 ACM Symposium on Applied Computing
Year:
2009


Abstract

We introduce a new data mining problem: mining truth tables in binary datasets. Given a matrix of objects and the properties they satisfy, a truth table identifies a subset of properties that exhibit maximal variability (and hence, complete independence) in occurrence patterns over the underlying objects. This problem is relevant in many domains, e.g., in bioinformatics where we seek to identify and model independent components of combinatorial regulatory pathways, and in social/economic demographics where we desire to determine independent behavioral attributes of populations. We outline a family of levelwise approaches adapted to mining truth tables, algorithmic optimizations, and applications to bioinformatics and political datasets.
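To make the idea concrete, here is a minimal sketch of a levelwise (Apriori-style) search. It assumes the simplest reading of the abstract: a set of k properties forms a truth table when every one of the 2^k possible 0/1 combinations occurs in at least `min_count` objects. The function names, the `min_count` parameter, and the row-of-tuples data layout are illustrative assumptions, not the authors' implementation, and the paper's exact definition and optimizations may differ. Under this reading the property is downward closed (every subset of a truth table is itself a truth table), which is what makes levelwise pruning applicable.

```python
from itertools import combinations


def is_truth_table(rows, cols, min_count=1):
    """True if every 0/1 combination over `cols` occurs in >= min_count rows.

    Assumes each row is a sequence of 0/1 values (hypothetical layout).
    """
    counts = {}
    for row in rows:
        key = tuple(row[c] for c in cols)
        counts[key] = counts.get(key, 0) + 1
    # All 2^k patterns must be present, each often enough.
    return len(counts) == 2 ** len(cols) and min(counts.values()) >= min_count


def mine_truth_tables(rows, min_count=1):
    """Levelwise search: grow column sets one property at a time."""
    n_cols = len(rows[0]) if rows else 0
    # Level 1: single columns that take both values.
    level = [(c,) for c in range(n_cols) if is_truth_table(rows, (c,), min_count)]
    found = []
    while level:
        found.extend(level)
        current = set(level)
        candidates = set()
        # Join two sets that share all but their last column.
        for a in level:
            for b in level:
                if a[:-1] == b[:-1] and a[-1] < b[-1]:
                    cand = a + (b[-1],)
                    # Prune: every (k-1)-subset must already be a truth table.
                    if all(sub in current
                           for sub in combinations(cand, len(cand) - 1)):
                        candidates.add(cand)
        level = sorted(c for c in candidates if is_truth_table(rows, c, min_count))
    return found


if __name__ == "__main__":
    # Column 2 is the XOR of columns 0 and 1, so no triple can qualify.
    data = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
    print(mine_truth_tables(data))
```

In the small example, each pair of columns shows all four combinations, but the three columns together show only four of the eight possible patterns, so the search stops at pairs. Raising `min_count` is one hypothetical way to require that each combination be well supported rather than merely present.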