Disambiguating Keyword Queries on RDF Databases Using "Deep" Segmentation

  • Authors:
  • Haizhou Fu;Sidan Gao;Kemafor Anyanwu


  • Venue:
  • ICSC '10 Proceedings of the 2010 IEEE Fourth International Conference on Semantic Computing
  • Year:
  • 2010

Abstract

Keyword search on (semi)structured databases is an increasingly popular research topic, but existing techniques do not handle ambiguous queries well. Recent approaches for RDF databases try to improve result quality by introducing an explicit top-k “interpretation” phase, in which a query is translated into an ordered list of the “most likely intended” structured (SPARQL) queries before execution. However, even these techniques address keyword query ambiguity only in a limited fashion, by identifying fine-grained semantic units, or segments, of a query. This prunes some incorrect interpretations, but the reduction in the interpretation space is not as aggressive as it could be. In this paper, we propose a “deep segmentation” technique for keyword queries issued against an RDF database. The approach prunes irrelevant interpretations more aggressively and therefore produces better-quality query interpretations even in the presence of significant query ambiguity. We present the results of a comprehensive human-based evaluation built on degree of ambiguity (DOTA), a metric that we introduce and that previous efforts have not considered. The experimental results show that our approach outperforms existing techniques in terms of quality, even when queries are very ambiguous.
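
To make the idea of pruning the interpretation space concrete, the following is a minimal illustrative sketch in Python. It is not the paper's deep segmentation algorithm, which the abstract does not specify. It only shows, under assumed toy data, how grouping keywords into segments can shrink the set of candidate interpretations. The `LABELS` vocabulary, the `ex:` terms, and the functions `segment` and `interpretations` are all hypothetical, and greedy longest-match is a stand-in segmentation strategy chosen for simplicity.

```python
from itertools import product

# Hypothetical toy vocabulary (an assumption, not data from the paper):
# each label maps to the RDF terms it could denote.
LABELS = {
    "john": ["ex:John_Doe", "ex:John_Smith"],
    "smith": ["ex:Smith_College", "ex:John_Smith"],
    "papers": ["ex:Paper"],
    "john smith": ["ex:John_Smith"],
}


def segment(keywords):
    """Greedy longest-match segmentation of a keyword list against LABELS."""
    segments, i = [], 0
    while i < len(keywords):
        # Try the longest phrase starting at position i first.
        for j in range(len(keywords), i, -1):
            phrase = " ".join(keywords[i:j])
            if phrase in LABELS:
                segments.append(phrase)
                i = j
                break
        else:
            # No label matches: keep the keyword as its own segment.
            segments.append(keywords[i])
            i += 1
    return segments


def interpretations(units):
    """Enumerate every combination of candidate terms, one per unit.

    A unit with no entry in LABELS has no candidates, so the result is empty.
    """
    candidates = [LABELS.get(unit, []) for unit in units]
    return list(product(*candidates))


keywords = "john smith papers".split()
per_keyword = interpretations(keywords)        # every keyword read independently
segments = segment(keywords)                   # keywords grouped into segments
per_segment = interpretations(segments)        # one reading per segment

print(len(per_keyword), "interpretations without segmentation")
print(len(per_segment), "interpretations with segmentation:", per_segment)
```

In this toy case, reading "john smith" as one segment rules out combinations such as `ex:John_Doe` with `ex:Smith_College`, which is the kind of incorrect interpretation that segmentation is meant to prune. A real system would also rank the surviving interpretations and translate the top-k into SPARQL queries.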