A distributed index for efficient parallel top-k keyword search on massive graphs

  • Authors:
  • Ming Zhong; Mengchi Liu

  • Affiliations:
  • Wuhan University, Wuhan, China; Carleton University, Ottawa, Canada

  • Venue:
  • Proceedings of the twelfth international workshop on Web information and data management
  • Year:
  • 2012

Abstract

Recently, a variety of indexing techniques have been proposed for optimizing keyword search on graphs. However, graph indexing has very high space and time complexity, so these single-machine, in-memory indices are usually not affordable for massive graphs. In this paper, we propose a novel distributed, disk-based index that organizes local topology information in the graph so that, before search begins, heuristics can track and prune matched vertices that will not participate in the top-k answers to a given query. The distributed index can be constructed in a MapReduce manner. We also develop a parallel search algorithm that runs multiple asynchronous search instances, each incrementally enumerating its current best local answers, and then produces the global top-k answers from them. Finally, we perform experiments on both synthetic and real graphs with various configurations. The results show that our approach significantly improves search efficiency on massive graphs with affordable indexing overhead.
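
The abstract does not spell out how the global top-k answers are assembled, so the following is only a minimal sketch of one plausible final step, not the authors' algorithm. It assumes that each search instance yields its local answers best-first, that an answer is a hypothetical `(cost, answer_id)` pair where lower cost is better, and that a lazy k-way merge is an acceptable stand-in for the combination step. The names `local_answers` and `global_top_k` are illustrative, and the asynchronous execution of the instances is not modelled here.

```python
import heapq
from itertools import islice
from typing import Iterable, Iterator, List, Tuple

# Assumption: an answer is (cost, answer_id) and a lower cost is better.
Answer = Tuple[float, str]


def local_answers(partition: List[Answer]) -> Iterator[Answer]:
    """Hypothetical search instance: yields its local answers best-first."""
    for answer in sorted(partition):
        yield answer


def global_top_k(streams: Iterable[Iterator[Answer]], k: int) -> List[Answer]:
    """Lazily merge best-first local streams and keep only the first k answers.

    heapq.merge pulls from each stream only as needed, so enumeration stops
    once k global answers have been produced.
    """
    return list(islice(heapq.merge(*streams), k))


# Toy data standing in for the answers found on three graph partitions.
partitions = [
    [(3.0, "a3"), (1.0, "a1")],
    [(2.0, "b2"), (5.0, "b5")],
    [(4.0, "c4")],
]
result = global_top_k((local_answers(p) for p in partitions), k=3)
print(result)  # [(1.0, 'a1'), (2.0, 'b2'), (3.0, 'a3')]
```

Because the merge is lazy, no instance in this sketch enumerates further than is needed to settle the k best answers, which mirrors the incremental enumeration the abstract describes.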