In the k-medoid problem, given a dataset P, we are asked to choose k points in P as the medoids. The optimal medoid set minimizes the average Euclidean distance between the points in P and their closest medoid. Finding the optimal k medoids is NP-hard, and existing algorithms aim at approximate answers, i.e., they compute medoids that achieve a small, yet not minimal, average distance. In this paper, we also aim at approximate solutions. We consider, however, the continuous version of the problem, where the points in P move and our task is to maintain the medoid set on the fly (trying to keep the average distance small). To the best of our knowledge, this work constitutes the first attempt at continuous medoid queries. First, we consider centralized monitoring, where the points issue location updates whenever they move. A server processes the stream of generated updates and constantly reports the current medoid set. Next, we address distributed monitoring, where we assume that the data points have some computational capabilities and take over part of the monitoring task. In particular, the server installs adaptive filters (i.e., permissible spatial ranges, called safe regions) at the points, which report their location only when they move outside their filters. The distributed techniques reduce the frequency of location updates (and, thus, the network overhead and the server load), at the cost of a slightly higher average distance compared to the centralized methods. Neither our centralized nor our distributed methods make any assumption about the data movement patterns (e.g., velocity vectors, trajectories, etc.), and both can be applied to an arbitrary number of medoids k. We demonstrate the efficiency and efficacy of our techniques through extensive experiments.
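As a minimal illustration of the two notions used above, the following Python sketch computes the average Euclidean distance of a point set to its closest medoid and checks whether a moving point has left its safe region. The function names are hypothetical, and the circular shape of the safe region is an assumption made only for this example; the text above says only that a filter is a permissible spatial range. It is not the monitoring algorithm itself.

```python
import math


def avg_medoid_distance(points, medoids):
    """Average Euclidean distance from each point to its closest medoid."""
    total = sum(min(math.dist(p, m) for m in medoids) for p in points)
    return total / len(points)


def must_report(location, region_center, region_radius):
    """Return True once a point lies outside its safe region.

    Assumption: the safe region is a circle (center, radius); this shape
    is chosen for illustration only.
    """
    return math.dist(location, region_center) > region_radius


# Four points in two groups; one medoid chosen from each group.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
medoids = [(0.0, 0.0), (10.0, 0.0)]
cost = avg_medoid_distance(points, medoids)  # (0 + 2 + 0 + 2) / 4
```

Under the distributed setting described above, a point would send its location to the server only when a check like `must_report` becomes true, whereas in the centralized setting every movement produces an update.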