Top-k query processing over uncertain data in distributed environments

  • Authors:
  • Yongjiao Sun; Ye Yuan; Guoren Wang

  • Affiliations:
  • College of Information Science & Engineering, Northeastern University, Shenyang, China (all authors)

  • Venue:
  • World Wide Web
  • Year:
  • 2012

Abstract

Although top-k queries over uncertain data in centralized databases have been studied widely in recent years, processing them in distributed environments remains challenging. In distributed environments such as Peer-to-Peer (P2P) systems and sensor networks, data objects are inherently uncertain because of imprecise measurements and network delays. It is therefore necessary to study how to retrieve the top-k uncertain data objects efficiently over distributed environments with minimum network overhead. In this paper, we propose a novel approach to processing uncertain top-k queries in large-scale P2P networks, where datasets are horizontally partitioned over peers. In our approach, each peer constructs an Uncertain Quad-Tree (UQ-Tree) index for its local uncertain data, while the P2P network constructs a global index by summarizing the local indexes. Based on the global index, we propose a spatial-pruning algorithm to reduce communication costs and a distributed-pruning algorithm to reduce computation costs. Extensive experiments verify the effectiveness and efficiency of the proposed methods in terms of communication costs and response time.
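
The general idea of using per-peer summaries to avoid contacting every peer can be illustrated with a minimal Python sketch. It is not the paper's UQ-Tree or either of its pruning algorithms, whose details the abstract does not give. The sketch assumes that each object has a score and an existence probability, that objects are ranked by expected score (score × probability), and that each peer's summary is just the largest expected score it holds. All names (`UncertainObject`, `Peer`, `distributed_top_k`) are hypothetical.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class UncertainObject:
    oid: str
    score: float
    prob: float  # assumed existence probability in [0, 1]

    @property
    def expected_score(self) -> float:
        # Assumed ranking function; the paper may use different semantics.
        return self.score * self.prob


class Peer:
    """A peer holding one horizontal partition of the data."""

    def __init__(self, name, objects):
        self.name = name
        self.objects = list(objects)

    def summary(self) -> float:
        # Stand-in for a local index summary: an upper bound on the
        # expected score of any object stored at this peer.
        return max((o.expected_score for o in self.objects),
                   default=float("-inf"))

    def local_top_k(self, k):
        return heapq.nlargest(k, self.objects, key=lambda o: o.expected_score)


def distributed_top_k(peers, k):
    """Return (top-k object ids, names of the peers actually contacted)."""
    # Toy "global index": peers ordered by their summary bound, best first.
    ranked = sorted(peers, key=lambda p: p.summary(), reverse=True)
    heap = []       # min-heap of (expected_score, oid), at most k entries
    contacted = []
    for peer in ranked:
        # Prune: once k candidates are held and this peer's bound cannot
        # beat the current k-th best, no later peer can either.
        if len(heap) == k and peer.summary() <= heap[0][0]:
            break
        contacted.append(peer.name)
        for obj in peer.local_top_k(k):
            item = (obj.expected_score, obj.oid)
            if len(heap) < k:
                heapq.heappush(heap, item)
            elif item > heap[0]:
                heapq.heapreplace(heap, item)
    return [oid for _, oid in sorted(heap, reverse=True)], contacted


if __name__ == "__main__":
    demo_peers = [
        Peer("A", [UncertainObject("a1", 10, 0.9), UncertainObject("a2", 8, 0.5)]),
        Peer("B", [UncertainObject("b1", 9, 0.8), UncertainObject("b2", 2, 0.9)]),
        Peer("C", [UncertainObject("c1", 5, 0.6), UncertainObject("c2", 4, 0.5)]),
    ]
    print(distributed_top_k(demo_peers, 2))
```

In the demo, peer C is never contacted, because its bound (3.0) is below the second-best expected score already collected from peers A and B. That is the kind of communication saving that pruning over a global index aims for.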