Faster methods for random sampling
Communications of the ACM
We examine several methods for drawing a sequential random sample of n records from a file containing N records. Method D is recommended for general use. The algorithm is on-line (so that CPU time can be overlapped with I/O), has a small constant memory requirement, and is easy to program. An improved implementation is detailed in the Appendix.
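
To make the problem statement concrete, the sketch below draws a sequential random sample of n records from a stream of N records in a single pass. It is not Method D, whose details are not given in this abstract; it is the classical selection-sampling baseline, which examines every record and picks each one with probability (records still needed) / (records still unread). The function name `sequential_sample` and its parameters are hypothetical and chosen only for illustration.

```python
import random


def sequential_sample(records, n, N, rng=random):
    """Yield n of the N records, chosen uniformly at random, in file order.

    Illustrative baseline only (one random draw per record), not Method D.
    `records` is any iterable expected to hold exactly N items.
    """
    if not 0 <= n <= N:
        raise ValueError("need 0 <= n <= N")
    still_needed = n   # records left to select
    still_unread = N   # records left in the file, including the current one
    for record in records:
        if still_needed == 0:
            break  # sample is complete; the rest of the file can be skipped
        # Select the current record with probability still_needed / still_unread.
        if rng.random() * still_unread < still_needed:
            yield record
            still_needed -= 1
        still_unread -= 1


if __name__ == "__main__":
    # Draw 5 record numbers out of 1..100; output varies from run to run.
    print(list(sequential_sample(range(1, 101), 5, 100)))
```

Like the algorithm described above, this sketch is on-line and keeps only two counters in memory, but it generates one random number per record read, so its running time grows with N rather than with n.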