Many data management problems would benefit from the ability to draw a few samples from a data set and use them to estimate the largest (or smallest) value in the entire data set. Min/max online aggregation, top-k query processing, outlier detection, and distance joins are just a few possible applications. This paper details a statistically rigorous, Bayesian approach to this problem. Just as importantly, we demonstrate the utility of the approach by showing how it can be applied to four specific problems that arise in data management.
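To make the problem statement concrete, the following minimal Python sketch shows why the largest value in a small sample is, on its own, a poor guess at the largest value in the full data set: it can never exceed the true maximum, so it is systematically too low. This is not the paper's Bayesian estimator. The synthetic population, the sample size of 20, and the helper name `sample_max` are all assumptions chosen for illustration.

```python
import random

def sample_max(data, n, seed=0):
    """Return the largest value in a simple random sample of size n (hypothetical helper)."""
    rng = random.Random(seed)          # seeded so the sketch is reproducible
    return max(rng.sample(data, n))    # sampling without replacement

# Assumed synthetic population: the integers 1..10000.
population = list(range(1, 10001))
true_max = max(population)

# Repeat the naive estimate over many independent samples.
estimates = [sample_max(population, 20, seed=s) for s in range(200)]
mean_estimate = sum(estimates) / len(estimates)

print("true maximum:      ", true_max)
print("mean naive estimate:", mean_estimate)
```

Every naive estimate is bounded above by the true maximum, so their average falls short of it. Closing that gap, and quantifying the remaining uncertainty, is the kind of problem a statistical approach such as the one described above is meant to address.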