Statistical estimators for aggregate relational algebra queries
ACM Transactions on Database Systems (TODS)
Efficient processing of spatial joins using R-trees
SIGMOD '93 Proceedings of the 1993 ACM SIGMOD international conference on Management of data
SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data
SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
Incremental distance join algorithms for spatial databases
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Ripple joins for online aggregation
SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Adaptive multi-stage distance join processing
SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Efficient algorithms for mining outliers from large data sets
SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Scalable Sweeping-Based Spatial Join
VLDB '98 Proceedings of the 24rd International Conference on Very Large Data Bases
Probabilistic Optimization of Top N Queries
VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
Distance-based outliers: algorithms and applications
The VLDB Journal — The International Journal on Very Large Data Bases
Mining distance-based outliers in near linear time with randomization and a simple pruning rule
Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
Monte Carlo Statistical Methods (Springer Texts in Statistics)
Monte Carlo Statistical Methods (Springer Texts in Statistics)
Turbo-charging estimate convergence in DBO
Proceedings of the VLDB Endowment
Supporting ranking queries on uncertain and incomplete data
The VLDB Journal — The International Journal on Very Large Data Bases
Distance-based outlier detection: consolidation and renewed bearing
Proceedings of the VLDB Endowment
Effective and efficient sampling methods for deep web aggregation queries
Proceedings of the 14th International Conference on Extending Database Technology
Synopses for Massive Data: Samples, Histograms, Wavelets, Sketches
Foundations and Trends in Databases
Hi-index | 0.00 |
For a large number of data management problems, it would be very useful to be able to obtain a few samples from a data set, and to use the samples to guess the largest (or smallest) value in the entire data set. Min/max online aggregation, top-k query processing, outlier detection, and distance join are just a few possible applications. This paper details a statistically rigorous, Bayesian approach to attacking this problem. Just as importantly, we demonstrate the utility of our approach by showing how it can be applied to two specific problems that arise in the context of data management.