Keyword proximity search in complex data graphs

  • Authors:
  • Konstantin Golenberg;Benny Kimelfeld;Yehoshua Sagiv

  • Affiliations:
  • The Hebrew University of Jerusalem, Jerusalem, Israel;The Hebrew University of Jerusalem, Jerusalem, Israel;The Hebrew University of Jerusalem, Jerusalem, Israel

  • Venue:
  • Proceedings of the 2008 ACM SIGMOD international conference on Management of data
  • Year:
  • 2008

Quantified Score

Hi-index 0.00

Visualization

Abstract

In keyword search over data graphs, an answer is a nonredundant subtree that includes the given keywords. An algorithm for enumerating answers is presented within an architecture that has two main components: an engine that generates a set of candidate answers and a ranker that evaluates their score. To be effective, the engine must have three fundamental properties. It should not miss relevant answers, has to be efficient and must generate the answers in an order that is highly correlated with the desired ranking. It is shown that none of the existing systems has implemented an engine that has all of these properties. In contrast, this paper presents an engine that generates all the answers with provable guarantees. Experiments show that the engine performs well in practice. It is also shown how to adapt this engine to queries under the OR semantics. In addition, this paper presents a novel approach for implementing rankers destined for eliminating redundancy. Essentially, an answer is ranked according to its individual properties (relevancy) and its intersection with the answers that have already been presented to the user. Within this approach, experiments with specific rankers are described.