In data stream applications, data arrive continuously and can be scanned only once, because the query processor has very limited memory (relative to the size of the stream) to work with. Hence, queries on data streams do not have access to the entire data set, and query answers are typically approximate. While there have been many studies on the k Nearest Neighbors (kNN) problem in conventional multi-dimensional databases, those solutions cannot be directly applied to data streams for the above reasons. In this paper, we investigate the kNN problem over data streams. We first introduce the e-approximate kNN (ekNN) problem, which finds the approximate kNN answers of a query point Q such that the absolute error of the k-th nearest neighbor distance is bounded by e. To support ekNN queries over streams, we propose a technique called DISC (aDaptive Indexing on Streams by space-filling Curves). DISC can adapt to different data distributions to either (a) optimize memory utilization to answer ekNN queries under certain accuracy requirements or (b) achieve the best accuracy under a given memory constraint. At the same time, DISC provides efficient updates and query processing, which are important requirements in data stream applications. Extensive experiments were conducted using both synthetic and real data sets, and the results confirm the effectiveness and efficiency of DISC.
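
To make the two central notions in the abstract concrete, the following is a minimal Python sketch, not the DISC algorithm itself. It shows (1) one common space-filling curve, the Z-order (Morton) curve, which maps a 2-D grid cell to a single sortable key, and (2) a brute-force check of the ekNN criterion as stated above. The abstract does not say which curve DISC uses, so the choice of Z-order is an assumption, and the function names `z_order_key`, `kth_nn_distance`, and `is_e_approximate` are hypothetical.

```python
import math


def z_order_key(x, y, bits=16):
    """Interleave the bits of integer grid coordinates x and y.

    Assumption: Z-order is used purely for illustration; the abstract
    only says "space-filling curves". Bits of x go to even positions,
    bits of y to odd positions.
    """
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)
        key |= ((y >> i) & 1) << (2 * i + 1)
    return key


def kth_nn_distance(points, q, k):
    """Exact distance from q to its k-th nearest neighbor (brute force)."""
    return sorted(math.dist(p, q) for p in points)[k - 1]


def is_e_approximate(answer, points, q, k, e):
    """Check the ekNN criterion for a candidate answer set.

    The answer must contain k points, and the distance to its farthest
    point (its k-th nearest neighbor distance) must differ from the
    exact k-th nearest neighbor distance by at most e.
    """
    if len(answer) != k:
        return False
    approx_kth = max(math.dist(p, q) for p in answer)
    return abs(approx_kth - kth_nn_distance(points, q, k)) <= e


if __name__ == "__main__":
    points = [(0, 0), (1, 0), (3, 0), (3.2, 0)]
    q = (0, 0)
    # The exact 3rd nearest neighbor is at distance 3; this answer
    # reports (3.2, 0) instead, an absolute error of about 0.2.
    answer = [(0, 0), (1, 0), (3.2, 0)]
    print(is_e_approximate(answer, points, q, k=3, e=0.25))
    print(is_e_approximate(answer, points, q, k=3, e=0.1))
    print(z_order_key(3, 5))
```

Sorting points by such a key lets a one-dimensional ordered structure hold multi-dimensional data, which is the general reason space-filling curves are attractive for indexing; how DISC adapts the curve to the data distribution and memory budget is described in the paper, not in this sketch.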