Summarizing answer graphs induced by keyword queries

  • Authors:
  • Yinghui Wu;Shengqi Yang;Mudhakar Srivatsa;Arun Iyengar;Xifeng Yan

  • Affiliations:
  • University of California Santa Barbara;University of California Santa Barbara;IBM Research;IBM Research;University of California Santa Barbara

  • Venue:
  • Proceedings of the VLDB Endowment
  • Year:
  • 2013

Abstract

Keyword search is widely used to query graph data. Because keyword queries lack structural constraints, a single query may generate an excessive number of matches, referred to as "answer graphs", which can capture different relationships among the keywords. An overlooked yet important task is to group and summarize answer graphs that share similar structures and contents, for better query interpretation and result understanding. This paper studies the summarization problem for the answer graphs induced by a keyword query Q. (1) A notion of summary graph is proposed to characterize the summarization of answer graphs. Given Q and a set of answer graphs G, a summary graph preserves the relationships among the keywords in Q by summarizing the paths connecting the keyword nodes in G. (2) A quality metric for summary graphs, called coverage ratio, is developed to measure the information loss of summarization. (3) Based on this metric, a set of summarization problems is formulated, which aim to find minimized summary graphs with a certain coverage ratio. (a) We show that the complexity of these summarization problems ranges from PTIME to NP-complete. (b) We provide exact and heuristic summarization algorithms. (4) Using real-life and synthetic graphs, we experimentally verify the effectiveness and efficiency of our techniques.
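
The abstract does not give the formal definition of coverage ratio, so the following Python sketch illustrates only one plausible reading of it: a keyword pair counts as "covered" when every label sequence of a path connecting that pair in some answer graph also appears in the summary graph, and the ratio is the fraction of keyword pairs that are covered. The graph encoding (a label dictionary plus an edge list), the function names `label_paths` and `coverage_ratio`, the `max_edges` bound, and the toy data are all assumptions made for illustration; they are not taken from the paper.

```python
from itertools import combinations


def adjacency(edges):
    """Build an undirected adjacency map from a list of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def label_paths(labels, edges, kw_a, kw_b, max_edges=6):
    """Label sequences of simple paths from a kw_a node to the first kw_b node reached.

    Assumption: a keyword node is any node whose label equals the keyword.
    """
    adj = adjacency(edges)
    found = set()
    stack = [(n, (n,)) for n, lab in labels.items() if lab == kw_a]
    while stack:
        node, path = stack.pop()
        if labels[node] == kw_b:
            found.add(tuple(labels[n] for n in path))
            continue
        if len(path) - 1 >= max_edges:  # bound the search depth
            continue
        for nxt in adj.get(node, ()):
            if nxt not in path:  # keep paths simple
                stack.append((nxt, path + (nxt,)))
    return found


def coverage_ratio(keywords, answers, summary):
    """Fraction of keyword pairs whose answer-graph label paths all survive in the summary.

    Assumed definition, not the paper's formal one.
    """
    pairs = list(combinations(keywords, 2))
    if not pairs:
        return 0.0
    covered = 0
    for a, b in pairs:
        needed = set()
        for labels, edges in answers:
            needed |= label_paths(labels, edges, a, b)
        if needed and needed <= label_paths(*summary, a, b):
            covered += 1
    return covered / len(pairs)


# Hypothetical toy data: three keywords, four single-path answer graphs.
keywords = ["k1", "k2", "k3"]
answers = [
    ({1: "k1", 2: "x", 3: "k2"}, [(1, 2), (2, 3)]),
    ({1: "k1", 2: "w", 3: "k2"}, [(1, 2), (2, 3)]),
    ({1: "k1", 2: "y", 3: "k3"}, [(1, 2), (2, 3)]),
    ({1: "k2", 2: "z", 3: "k3"}, [(1, 2), (2, 3)]),
]
summary_labels = {"a": "k1", "b": "k2", "c": "k3", "x": "x", "w": "w", "y": "y", "z": "z"}
full_summary = (summary_labels,
                [("a", "x"), ("x", "b"), ("a", "w"), ("w", "b"),
                 ("a", "y"), ("y", "c"), ("b", "z"), ("z", "c")])
# Dropping the path through "w" yields a smaller summary that loses one keyword pair.
small_summary = (summary_labels,
                 [("a", "x"), ("x", "b"), ("a", "y"), ("y", "c"),
                  ("b", "z"), ("z", "c")])

print(coverage_ratio(keywords, answers, full_summary))
print(coverage_ratio(keywords, answers, small_summary))
```

Under this reading, shrinking the summary graph trades size against coverage, which is the tension the summarization problems in item (3) are formulated around. Exhaustive path enumeration is used here only for clarity and would not scale to large graphs.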