Graph-based concept identification and disambiguation for enterprise search

  • Authors:
  • Falk Brauer; Michael Huber; Gregor Hackenbroich; Ulf Leser; Felix Naumann; Wojciech M. Barczynski

  • Affiliations:
  • SAP AG, Dresden, Germany; ARITHNEA GmbH, Taufkirchen, Germany; SAP AG, Dresden, Germany; Humboldt-Universitaet, Berlin, Germany; Hasso-Plattner-Institut, Potsdam, Germany; SAP AG, Dresden, Germany

  • Venue:
  • Proceedings of the 19th International Conference on World Wide Web
  • Year:
  • 2010

Abstract

Enterprise Search (ES) differs from traditional IR for a number of reasons. The most important are the high level of ambiguity of terms in queries and documents, and the existence of graph-structured enterprise data (ontologies) that describe the concepts of interest and their relationships to each other. Our method identifies concepts from the enterprise ontology in the query and the corpus. We propose a ranking scheme for ontology sub-graphs on top of approximately matched token q-grams. The ranking leverages the graph structure of the ontology to incorporate concepts that are not explicitly mentioned. It improves on previous solutions by using a fine-grained ranking function that is specifically designed to cope with high levels of ambiguity. This method is able to capture much more of the semantics of queries and documents than previous techniques. We support this claim with an evaluation of our method in three real-life scenarios from two different domains, in which it was consistently superior in terms of both precision and recall.
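
The abstract does not spell out how token q-grams are matched approximately, so the following is only a minimal sketch of one plausible reading, not the authors' method. It assumes that windows of up to `max_tokens` consecutive tokens are compared against ontology concept labels by the Jaccard similarity of their character q-gram sets. The function names, the padding character, the threshold, and the example labels are all hypothetical, and the sub-graph ranking step described in the abstract is not covered.

```python
def char_qgrams(text, q=3):
    """Return the set of character q-grams of a lowercased, padded string."""
    # Padding with '#' (an assumed convention) lets word boundaries contribute q-grams.
    padded = "#" * (q - 1) + text.lower() + "#" * (q - 1)
    return {padded[i:i + q] for i in range(len(padded) - q + 1)}


def qgram_similarity(a, b, q=3):
    """Jaccard similarity of the character q-gram sets of two strings."""
    grams_a, grams_b = char_qgrams(a, q), char_qgrams(b, q)
    return len(grams_a & grams_b) / len(grams_a | grams_b)


def match_concepts(text, labels, max_tokens=2, threshold=0.5):
    """Return (window, label, score) triples, best score first.

    A window is a run of 1..max_tokens consecutive whitespace-separated
    tokens; it is kept when its similarity to a label reaches the threshold.
    """
    tokens = text.split()
    matches = []
    for n in range(1, max_tokens + 1):
        for i in range(len(tokens) - n + 1):
            window = " ".join(tokens[i:i + n])
            for label in labels:
                score = qgram_similarity(window, label)
                if score >= threshold:
                    matches.append((window, label, score))
    return sorted(matches, key=lambda m: -m[2])


# Hypothetical concept labels; a misspelled query still finds its concept.
labels = ["NetWeaver Portal", "Business Warehouse"]
matches = match_concepts("netwever portal error", labels)
```

In a pipeline like the one the abstract outlines, candidate matches of this kind would then be scored against the ontology graph, so that related concepts can resolve ambiguous mentions. That ranking is specific to the paper and is deliberately left out here.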