Geometric coordinates are an integral part of many data streams. Examples include sensor locations in environmental monitoring, vehicle locations in traffic monitoring or battlefield simulations, and scientific measurements of earth or atmospheric phenomena. This paper focuses on summarizing such geometric data streams in limited storage so that many natural geometric queries can be answered faithfully. Examples of such queries are: report the smallest convex region in which a chemical leak has been sensed, track the diameter of the dataset, or track the extent of the dataset in any given direction. One can also pose queries over multiple streams: for instance, track the minimum distance between the convex hulls of two data streams, report when datasets A and B are no longer linearly separable, or report when the points of data stream A become completely surrounded by the points of data stream B. These queries extend easily to more than two streams. In this paper, we propose an adaptive sampling scheme that gives provably optimal error bounds for extremal problems of this nature. All our results follow from a single technique for computing the approximate convex hull of a point stream in a single pass. Our main result is this: given a stream of two-dimensional points and an integer r, we can maintain an adaptive sample of at most 2r+1 points such that the distance between the true convex hull and the convex hull of the sample points is O(D/r^2), where D is the diameter of the sample set. The amortized time for processing each point in the stream is O(log r). Using the sample convex hull, all the queries mentioned above can be answered approximately in either O(log r) or O(r) time.
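To make the idea of a small, single-pass hull summary concrete, here is a minimal sketch of a simpler baseline: uniform direction sampling, which keeps the extreme stream point in each of r evenly spaced directions. This is not the adaptive scheme proposed in the abstract. It stores at most r points, spends O(r) time per point rather than the amortized O(log r) stated above, and its error is generally coarser than the O(D/r^2) bound. The class name `DirectionalExtremaSketch` and its methods are hypothetical names chosen for illustration.

```python
import math


class DirectionalExtremaSketch:
    """Single-pass summary of a 2D point stream (illustrative baseline only).

    Keeps, for each of r evenly spaced directions, the stream point that is
    farthest in that direction. This is uniform direction sampling, NOT the
    adaptive sampling scheme described in the abstract.
    """

    def __init__(self, r):
        if r < 3:
            raise ValueError("need at least 3 directions")
        # Unit vectors at angles 2*pi*k/r, k = 0..r-1.
        self.dirs = [
            (math.cos(2 * math.pi * k / r), math.sin(2 * math.pi * k / r))
            for k in range(r)
        ]
        self.extreme = [None] * r  # current extreme point per direction

    def add(self, p):
        """Process one stream point p = (x, y) in O(r) time."""
        x, y = p
        for k, (dx, dy) in enumerate(self.dirs):
            q = self.extreme[k]
            if q is None or x * dx + y * dy > q[0] * dx + q[1] * dy:
                self.extreme[k] = p

    def sample(self):
        """Distinct stored points (at most r of them), sorted."""
        return sorted({q for q in self.extreme if q is not None})

    def extent(self, theta):
        """Approximate extent (max minus min projection) in direction theta."""
        dx, dy = math.cos(theta), math.sin(theta)
        proj = [x * dx + y * dy for x, y in self.sample()]
        return max(proj) - min(proj) if proj else 0.0

    def diameter(self):
        """Approximate diameter: largest pairwise distance within the sample."""
        pts = self.sample()
        return max(
            (math.hypot(a[0] - b[0], a[1] - b[1]) for a in pts for b in pts),
            default=0.0,
        )


sketch = DirectionalExtremaSketch(r=8)
for point in [(0, 0), (4, 0), (4, 3), (0, 3), (2, 1)]:
    sketch.add(point)
print(sketch.sample())
print(sketch.diameter())
```

Queries such as the diameter or the extent in a given direction are then answered from the stored sample alone, which is the same pattern the abstract describes for the sample convex hull. The adaptive scheme differs in how it chooses which directions to sample.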