Efficient reachability query evaluation in large spatiotemporal contact datasets

  • Authors:
  • Houtan Shirani-Mehr;Farnoush Banaei-Kashani;Cyrus Shahabi

  • Affiliations:
  • Univ. of Southern California;Univ. of Southern California;Univ. of Southern California

  • Venue:
  • Proceedings of the VLDB Endowment
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

With the advent of reliable positioning technologies and prevalence of location-based services, it is now feasible to accurately study the propagation of items such as infectious viruses, sensitive information pieces, and malwares through a population of moving objects, e.g., individuals, mobile devices, and vehicles. In such application scenarios, an item passes between two objects when the objects are sufficiently close (i.e., when they are, so-called, in contact), and hence once an item is initiated, it can penetrate the object population through the evolving network of contacts among objects, termed contact network. In this paper, for the first time we define and study reachability queries in large (i.e., disk-resident) contact datasets which record the movement of a (potentially large) set of objects moving in a spatial environment over an extended time period. A reachability query verifies whether two objects are "reachable" through the evolving contact network represented by such contact datasets. We propose two contact-dataset indexes that enable efficient evaluation of such queries despite the potentially humongous size of the contact datasets. With the first index, termed ReachGrid, at the query time only a small necessary portion of the contact network which is required for reachability evaluation is constructed and traversed. With the second approach, termed ReachGraph, we precompute reachability at different scales and leverage these precalculations at the query time for efficient query processing. We optimize the placement of both indexes on disk to enable efficient index traversal during query processing. We study the pros and cons of our proposed approaches by performing extensive experiments with both real and synthetic data. Based on our experimental results, our proposed approaches outperform existing reachability query processing techniques in contact networks by 76% on average.