RankIE: document retrieval on ranked entity graphs

Authors:
Falk Brauer;Wojciech Barczynski;Gregor Hackenbroich;Marcus Schramm;Adrian Mocan;Felix Förster
Affiliations:
SAP AG, SAP Research Chemnitzer, Dresden, Germany;SAP AG, SAP Research Chemnitzer, Dresden, Germany;SAP AG, SAP Research Chemnitzer, Dresden, Germany;SAP AG, SAP Research Chemnitzer, Dresden, Germany;SAP AG, SAP Research Chemnitzer, Dresden, Germany;SAP AG, SAP Research Chemnitzer, Dresden, Germany
Venue:
Proceedings of the VLDB Endowment
Year:
2009



Abstract

Developer communities built around software products, like the SAP Community Network (SCN), provide a knowledge base for recurring problems and their solutions. Because of the large amount of content maintained in such communities, e.g., in forums, finding relevant solutions is a major challenge beyond the scope of common keyword-based search engines. In fact, we measured that around 50% of the forum questions in our particular scenario have already been answered at the time they are posted. We address this challenge with an entity-aware search, which exploits structured knowledge, such as domain-specific ontologies, for both query interpretation and the creation of document indexes. The system takes a natural language query as input, interprets it as an entity graph, matches this graph against pre-processed content, and supports the user in refining the query based on the top-k relevant entities. Results are presented in a user interface that supports faceted search based on entities. Additionally, the user interface is structured according to the possible search intentions of users. The evaluation of our system on the SCN scenario shows that the top 5 entities in user queries are recognized with a precision of 83%, compared to 61% for state-of-the-art algorithms.
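The pipeline described in the abstract (recognize entities in a query, match them against an entity-based document index, and offer the most frequent co-occurring entities as refinement facets) can be illustrated with a minimal sketch. This is not the RankIE algorithm: the abstract does not specify the entity-graph construction or the ranking function, so the sketch swaps in plain dictionary lookup and entity-overlap counting. The ontology entries, the forum posts, and the function names `extract_entities`, `build_index`, and `search` are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical toy ontology (assumption): surface form -> entity identifier.
ONTOLOGY = {
    "netweaver": "product:NetWeaver",
    "abap": "language:ABAP",
    "timeout": "error:Timeout",
    "rfc": "interface:RFC",
}


def extract_entities(text):
    """Return the set of ontology entities whose surface form occurs in text."""
    cleaned = text.lower()
    for mark in "?.,":
        cleaned = cleaned.replace(mark, " ")
    return {ONTOLOGY[token] for token in cleaned.split() if token in ONTOLOGY}


def build_index(docs):
    """Map each entity to the set of document ids that mention it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for entity in extract_entities(text):
            index[entity].add(doc_id)
    return index


def search(query, index, k=5):
    """Rank documents by shared query entities; also return top-k facet entities."""
    query_entities = extract_entities(query)

    # Score = number of query entities a document mentions (a stand-in
    # for graph matching, which the abstract does not detail).
    scores = Counter()
    for entity in query_entities:
        for doc_id in index.get(entity, ()):
            scores[doc_id] += 1
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

    # Facets: entities not in the query, counted over the matching documents.
    hits = {doc_id for doc_id, _ in ranked}
    facets = []
    for entity, doc_ids in index.items():
        count = len(doc_ids & hits)
        if entity not in query_entities and count > 0:
            facets.append((entity, count))
    facets.sort(key=lambda kv: (-kv[1], kv[0]))
    return ranked, facets[:k]


# Hypothetical forum posts (assumption), used only to exercise the sketch.
DOCS = {
    "post-1": "RFC call from ABAP ends with a timeout.",
    "post-2": "How to configure NetWeaver?",
    "post-3": "ABAP timeout in NetWeaver, any ideas?",
}

if __name__ == "__main__":
    ranked, facets = search("Why do I get a timeout in ABAP?", build_index(DOCS))
    print(ranked)
    print(facets)
```

In this toy setup, the returned facets play the role of the top-k entities a user could select to refine the query; a real system would replace the overlap count with a proper relevance model over the entity graph.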