Aggregate nearest neighbor queries in uncertain graphs

  • Authors:
  • Zhang Liu;Chaokun Wang;Jianmin Wang

  • Affiliations:
  • Department of Computer Science and Technology, Tsinghua University, Beijing, People's Republic of China 100084;School of Software, Tsinghua University, Beijing, People's Republic of China 100084;School of Software, Tsinghua University, Beijing, People's Republic of China 100084

  • Venue:
  • World Wide Web
  • Year:
  • 2014

Quantified Score

Hi-index 0.00

Visualization

Abstract

Most recently, uncertain graph data begin attracting significant interests of database research community, because uncertainty is the intrinsic property of the real-world and data are more suitable to be modeled as graphs in numbers of applications, e.g. social network analysis, PPI networks in biology, and road network monitoring. Meanwhile, as one of the basic query operators, aggregate nearest neighbor (ANN) query retrieves a data entity whose aggregate distance, e.g. sum, max, to the given query data entities is smaller than those of other data entities in a database. ANN query on both certain graph data and high dimensional data has been well studied by previous work. However, existing ANN query processing approaches cannot handle the situation of uncertain graphs, because topological structures of an uncertain graph may vary in different possible worlds. Motivated by this, we propose the aggregate nearest neighbor query in uncertain graphs (UG-ANN) in this paper. First of all, we give the formal definition of UG-ANN query and the basic UG-ANN query algorithm. After that, to improve the efficiency of UG-ANN query processing, we develop two kinds of pruning approaches, i.e. structural pruning and instance pruning. The structural pruning takes advantages the monotonicity of the aggregate distance to derive the upper and lower bounds of the aggregate distance for reducing the graph size. Whereas, the instance pruning decreases the number of possible worlds to be checked in the searching tree. Comprehensive experimental results on real-world data sets demonstrate that the proposed method significantly improves the efficiency of the UG-ANN query processing.