Reverse nearest neighbor aggregates over data streams

  • Authors:
  • Flip Korn;S. Muthukrishnan;Divesh Srivastava

  • Affiliations:
  • AT&T Labs-Research;AT&T Labs-Research;AT&T Labs-Research

  • Venue:
  • VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases
  • Year:
  • 2002

Quantified Score

Hi-index 0.00

Visualization

Abstract

Reverse Nearest Neighbor (RNN) queries have been studied for finite, stored data sets and are of interest for decision support. However, in many applications such as fixed wireless telephony access and sensor-based highway traffic monitoring, the data arrives in a stream and cannot be stored. Exploratory analysis on this data stream can be formalized naturally using the notion of RNN aggregates (RNNAs), which involve the computation of some aggregate (such as C0UNT or MAX DISTANCE) over the set of reverse nearest neighbor "clients" associated with each "server". In this paper, we introduce and investigate the problem of computing three types of RNNA queries over data streams of "client" locations: (i) Max-RNNA: given K servers, return the maximum RNNA over all clients to their closest servers; (ii) List-RNNA: given K servers, return a list of RNNAs over all clients to each of the K servers; and (iii) Opt-RNNA: find a subset of at most K servers for which their RNNAs are below a given threshold. While exact computation of these queries is not possible in the data stream model, we present efficient algorithms to approximately answer these RNNA queries over data streams with error guarantees. We provide analytical proofs of constant factor approximations for many RNNA queries, and complement our analyses with experimental evidence of the accuracy of our techniques.