An automated search space reduction methodology for large databases

Authors:
Angel Kuri-Morales
Affiliations:
Departamento de Computación, Instituto Tecnológico Autónomo de México, Mexico
Venue:
ICDM'13 Proceedings of the 13th international conference on Advances in Data Mining: applications and theoretical aspects
Year:
2013

Abstract

Given the present need for Customer Relationship Management and the rapid growth in database size, many new approaches to clustering and processing large databases have been attempted. In this work we propose a methodology based on the idea that statistically proven search space reduction is possible in practice. Following a previous methodology, two clustering models are generated: one corresponding to the full data set and another to the sampled data set. The resulting empirical distributions were mathematically tested by applying an algorithmic verification.
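
The comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm, the sampling scheme, or the statistical test, so everything below is an assumption chosen for illustration: plain k-means on one-dimensional synthetic data, uniform random sampling, and a two-sample Kolmogorov-Smirnov statistic as the check on the empirical distributions. The function names (`kmeans_1d`, `ks_statistic`, `compare_full_vs_sample`) are hypothetical and do not come from the paper.

```python
import random
from bisect import bisect_right


def kmeans_1d(values, k=2, iters=50):
    """Plain Lloyd's k-means on scalar data; returns the sorted centroids."""
    srt = sorted(values)
    # Deterministic start: spread the initial centroids across the sorted data.
    centroids = [srt[(2 * i + 1) * len(srt) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centroids[j]))
            groups[nearest].append(v)
        # Keep the old centroid if a cluster happens to be empty.
        updated = [sum(g) / len(g) if g else centroids[j]
                   for j, g in enumerate(groups)]
        if updated == centroids:
            break
        centroids = updated
    return sorted(centroids)


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between the ECDFs."""
    sa, sb = sorted(a), sorted(b)
    return max(
        abs(bisect_right(sa, x) / len(sa) - bisect_right(sb, x) / len(sb))
        for x in sa + sb
    )


def compare_full_vs_sample(full, sample_size, seed=0):
    """Cluster the full data and a uniform sample, then compare the two."""
    sample = random.Random(seed).sample(full, sample_size)
    return kmeans_1d(full), kmeans_1d(sample), ks_statistic(full, sample)


# Synthetic stand-in for a "large database": two well-separated groups.
rng = random.Random(42)
full_data = ([rng.gauss(0.0, 1.0) for _ in range(5000)]
             + [rng.gauss(10.0, 1.0) for _ in range(5000)])

full_c, sample_c, ks = compare_full_vs_sample(full_data, sample_size=500)
print("centroids (full):  ", full_c)
print("centroids (sample):", sample_c)
print("KS statistic:      ", ks)
```

If the sample preserves the structure of the full data, the two sets of centroids should lie close together and the KS statistic should be small. In a real setting the acceptance threshold, the sample size, and the test itself would need to follow the methodology in the paper rather than this sketch.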