Biased box sampling - a density-biased sampling for clustering
Proceedings of the 2007 ACM symposium on Applied computing
The volume and complexity of data collected by modern applications have grown significantly, making both data manipulation and analysis increasingly costly. Sampling is a useful data reduction technique for bringing such data down to a more manageable volume. Uniform sampling has been widely used, but in datasets with a skewed cluster distribution, biased sampling gives better results. This paper presents BBS (Biased Box Sampling), an algorithm that aims to preserve the skewed tendency of the clusters in the original data. We also present experimental results obtained with the proposed BBS algorithm.
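To illustrate why uniform sampling can struggle on skewed cluster distributions, the following is a minimal sketch of a generic grid-based density-biased sampler. It is not the BBS algorithm, whose details are not given here. The function names, the grid partitioning, the bias exponent `e`, and the synthetic data are all assumptions made for illustration. Each point in a cell holding `n` points is kept with probability proportional to `n ** (-e)`, so `e = 0` reduces to uniform sampling and larger values shift the sample toward sparse regions.

```python
import random
from collections import defaultdict


def grid_cell(point, cell_size):
    """Map a point to the integer coordinates of its grid cell."""
    return tuple(int(c // cell_size) for c in point)


def density_biased_sample(points, sample_size, cell_size=1.0, e=0.5, seed=0):
    """Hypothetical grid-based density-biased sampler (not BBS itself).

    A point in a cell with n points is kept with probability
    sample_size * n**(-e) / sum_j(n_j**(1 - e)), capped at 1.
    Without capping, the expected sample size equals sample_size.
    """
    rng = random.Random(seed)
    cells = defaultdict(list)
    for p in points:
        cells[grid_cell(p, cell_size)].append(p)

    total_weight = sum(len(members) ** (1.0 - e) for members in cells.values())
    sample = []
    for members in cells.values():
        n = len(members)
        prob = min(1.0, sample_size * n ** (-e) / total_weight)
        sample.extend(p for p in members if rng.random() < prob)
    return sample


def make_skewed_data(n_dense=10000, n_sparse=100, seed=1):
    """Synthetic 2-D data: one large cluster and one small cluster."""
    rng = random.Random(seed)
    dense = [(rng.uniform(0.1, 0.9), rng.uniform(0.1, 0.9)) for _ in range(n_dense)]
    sparse = [(rng.uniform(5.1, 5.9), rng.uniform(5.1, 5.9)) for _ in range(n_sparse)]
    return dense + sparse


def count_sparse(sample):
    """Count sampled points that belong to the small cluster."""
    return sum(1 for x, _ in sample if x > 5.0)
```

With the synthetic data above and a target of 200 points, `e = 0` keeps each point with the same probability (about 2%), so only a handful of the 100 small-cluster points are expected to survive. With `e = 1`, both cells contribute roughly equally, and every small-cluster point is retained.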