Discovering interesting holes in data

Authors:
Bing Liu;Liang-Ping Ku;Wynne Hsu
Affiliations:
Department of Information Systems and Computer Science, National University of Singapore, Singapore;Department of Information Systems and Computer Science, National University of Singapore, Singapore;Department of Information Systems and Computer Science, National University of Singapore, Singapore
Venue:
IJCAI'97 Proceedings of the Fifteenth International Joint Conference on Artificial Intelligence - Volume 2
Year:
1997

Abstract

Current machine learning and discovery techniques focus on discovering rules or regularities that exist in data. An important aspect of the research that has been ignored in the past is the learning or discovering of interesting holes in the database. If we view each case in the database as a point in a k-dimensional space, then a hole is simply a region in the space that contains no data point. Clearly, not every hole is interesting. Some holes are obvious because it is known that certain value combinations are not possible. Some holes exist because there are insufficient cases in the database. However, in some situations, empty regions do carry important information. For instance, they could warn us about some missing value combinations that are either not known before or are unexpected. Knowing these missing value combinations may lead to significant discoveries. In this paper, we propose an algorithm to discover holes in databases.
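
To make the definition of a hole concrete, the following minimal Python sketch tests whether a candidate region contains no data point. It illustrates the definition only and is not the algorithm proposed in the paper. It assumes the region is an axis-aligned open box given by its lower and upper corners; the function name is_hole and the sample cases are hypothetical.

```python
from typing import Sequence

Point = Sequence[float]


def is_hole(points: Sequence[Point], lower: Point, upper: Point) -> bool:
    """Return True if no point lies strictly inside the open box (lower, upper).

    Assumption: the region is an axis-aligned box; the abstract itself
    only says a hole is "a region in the space that contains no data point".
    """
    return not any(
        all(lo < x < hi for x, lo, hi in zip(p, lower, upper))
        for p in points
    )


# Hypothetical 2-dimensional cases, one tuple per database record.
cases = [(1.0, 1.0), (2.0, 8.0), (9.0, 2.0), (8.0, 9.0)]

print(is_hole(cases, (3.0, 3.0), (7.0, 7.0)))  # True: no case falls inside
print(is_hole(cases, (0.0, 0.0), (5.0, 5.0)))  # False: (1.0, 1.0) is inside
```

An emptiness test like this says nothing about whether a hole is interesting. As the abstract notes, some empty regions merely reflect impossible value combinations or too few cases.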