We introduce a new data mining problem: mining truth tables in binary datasets. Given a matrix of objects and the properties they satisfy, a truth table identifies a subset of properties that exhibit maximal variability (and hence, complete independence) in occurrence patterns over the underlying objects. This problem is relevant in many domains, e.g., in bioinformatics, where we seek to identify and model independent components of combinatorial regulatory pathways, and in social/economic demographics, where we desire to determine independent behavioral attributes of populations. We outline a family of levelwise approaches adapted to mining truth tables, describe algorithmic optimizations, and present applications to bioinformatics and political datasets.
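The problem statement above can be illustrated with a minimal sketch. It assumes that "maximal variability" means every one of the 2^k possible 0/1 combinations over a set of k properties occurs in at least one object; the abstract does not spell out the exact definition, so this reading is an assumption. Under that reading the property is anti-monotone (if a set of properties qualifies, so does every subset), which is what makes an Apriori-style levelwise search applicable. The function names `is_truth_table` and `levelwise_truth_tables` are hypothetical, and none of the paper's algorithmic optimizations are reproduced here.

```python
from itertools import combinations


def is_truth_table(rows, cols):
    """Return True if all 2^k 0/1 patterns over `cols` occur in some row.

    Assumed reading of "maximal variability"; not taken from the paper.
    """
    patterns = {tuple(row[c] for c in cols) for row in rows}
    return len(patterns) == 2 ** len(cols)


def levelwise_truth_tables(rows):
    """Enumerate all qualifying column subsets, one size (level) at a time."""
    n_cols = len(rows[0]) if rows else 0
    # Level 1: single columns that take both values 0 and 1.
    level = [(c,) for c in range(n_cols) if is_truth_table(rows, (c,))]
    found = []
    while level:
        found.extend(level)
        current = set(level)
        candidates = set()
        # Join two k-sets that share their first k-1 columns.
        for a in level:
            for b in level:
                if a[:-1] == b[:-1] and a[-1] < b[-1]:
                    cand = a + (b[-1],)
                    # Prune: every k-subset of the candidate must qualify.
                    if all(s in current for s in combinations(cand, len(a))):
                        candidates.add(cand)
        # Keep only candidates that pass the check against the data.
        level = sorted(c for c in candidates if is_truth_table(rows, c))
    return found


# Toy binary matrix: 4 objects (rows) by 3 properties (columns).
example = [
    (0, 0, 1),
    (0, 1, 1),
    (1, 0, 0),
    (1, 1, 0),
]
print(levelwise_truth_tables(example))
```

In this toy matrix, columns 0 and 2 are perfectly anti-correlated, so the pair (0, 2) shows only two of the four possible patterns and is rejected, while (0, 1) and (1, 2) each show all four. A subset of k properties can only qualify when there are at least 2^k objects, which bounds how deep the levelwise search can go.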