A search space reduction methodology for data mining in large databases
Engineering Applications of Artificial Intelligence
Given the present need for Customer Relationship and the rapid growth in database size, many new approaches to clustering and processing large databases have been attempted. In this work we propose a methodology based on the idea that statistically proven search space reduction is possible in practice. Following a previous methodology, two clustering models are generated: one from the full data set and one from the sampled data set. The resulting empirical distributions are then tested mathematically by applying an algorithmic verification.
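The comparison of a full-data model with a sampled-data model can be illustrated with a small sketch. It is not the authors' procedure: the abstract does not name the clustering algorithm, the sampling scheme, or the verification test, so the one-dimensional k-means, the uniform random sample, the synthetic data, and the total variation distance used below are all assumptions chosen for illustration, as are the function names.

```python
import random


def kmeans_1d(values, k, iters=50, seed=0):
    """Plain Lloyd's k-means on 1-D data (assumed stand-in); returns sorted centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(values, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            groups[nearest].append(v)
        # Keep the previous centroid if a cluster ends up empty.
        centroids = [sum(g) / len(g) if g else c
                     for g, c in zip(groups, centroids)]
    return sorted(centroids)


def cluster_proportions(values, centroids):
    """Fraction of points assigned to each centroid (an empirical distribution)."""
    counts = [0] * len(centroids)
    for v in values:
        nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
        counts[nearest] += 1
    return [c / len(values) for c in counts]


def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


# Synthetic "full" data set: two well-separated groups (hypothetical data).
rng = random.Random(42)
full = ([rng.gauss(0, 1) for _ in range(2000)]
        + [rng.gauss(10, 1) for _ in range(2000)])
sample = rng.sample(full, 400)  # assumed 10% uniform sample

full_model = kmeans_1d(full, k=2)      # model built from the full data set
sample_model = kmeans_1d(sample, k=2)  # model built from the sampled data set

# Apply both models to the full data and compare the resulting distributions.
p_full = cluster_proportions(full, full_model)
p_sample = cluster_proportions(full, sample_model)
tv = total_variation(p_full, p_sample)
print("full centroids:", full_model)
print("sample centroids:", sample_model)
print("total variation distance:", tv)
```

A small distance suggests that, for this toy data, the sampled model partitions the full data set much as the full model does. A real study would replace the distance with a formal statistical test and use the actual clustering method.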