Developer communities built around software products, like the SAP Community Network (SCN), provide a knowledge base for recurring problems and their solutions. Because of the large amount of content maintained in such communities, e.g., in forums, finding relevant solutions is a major challenge that goes beyond the scope of common keyword-based search engines. In fact, we measured that around 50% of the forum questions in our particular scenario had already been answered at the time they were posted. We address this challenge with an entity-aware search, which exploits structured knowledge, such as domain-specific ontologies, for both query interpretation and the creation of document indexes. The system takes a natural language query as input, interprets it as an entity graph, matches this graph against pre-processed content, and supports the user in refining the query based on the top-k relevant entities. Results are presented in a user interface that supports faceted search based on entities. Additionally, the user interface is structured according to the possible search intentions of users. The evaluation of our system on the SCN scenario shows that the top 5 entities in user queries are recognized with a precision of 83%, compared to 61% for state-of-the-art algorithms.
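The query pipeline described above (recognize entities in the query, match them against an entity index built from pre-processed posts, and suggest top-k entities for refinement) can be illustrated with a minimal sketch. It is not the system's actual implementation: the toy `ONTOLOGY`, the sample `DOCS`, and the functions `recognize_entities`, `build_entity_index`, and `search` are all hypothetical, and a flat set of entities stands in for the entity graph, with simple overlap counting in place of the real ranking.

```python
from collections import Counter

# Hypothetical toy ontology: surface form -> entity id (assumption, not from the paper).
ONTOLOGY = {
    "netweaver": "product:NetWeaver",
    "abap": "language:ABAP",
    "timeout": "error:Timeout",
    "rfc": "protocol:RFC",
}


def recognize_entities(text):
    """Return the set of ontology entities whose surface form occurs in the text."""
    for mark in "?.,":
        text = text.replace(mark, " ")
    return {ONTOLOGY[token] for token in text.lower().split() if token in ONTOLOGY}


def build_entity_index(docs):
    """Pre-process documents: map each document id to its recognized entities."""
    return {doc_id: recognize_entities(text) for doc_id, text in docs.items()}


def search(query, index, k=5):
    """Rank documents by entity overlap with the query and suggest up to k refinement entities."""
    query_entities = recognize_entities(query)
    scored = [(len(query_entities & entities), doc_id) for doc_id, entities in index.items()]
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    hits = [doc_id for score, doc_id in ranked if score > 0]
    # Entities that co-occur in the hits but are absent from the query
    # are offered as refinement candidates (facets).
    co_occurring = Counter(e for doc_id in hits for e in index[doc_id] - query_entities)
    by_count = sorted(co_occurring.items(), key=lambda item: (-item[1], item[0]))
    return hits, [entity for entity, _ in by_count[:k]]


# Hypothetical forum posts used only for illustration.
DOCS = {
    "post-1": "RFC call to NetWeaver ends with a timeout.",
    "post-2": "How do I write a loop in ABAP?",
    "post-3": "NetWeaver ABAP report runs into timeout",
}

if __name__ == "__main__":
    hits, suggestions = search("Why do I get a timeout in NetWeaver?", build_entity_index(DOCS))
    print(hits)
    print(suggestions)
```

In this sketch the suggested entities play the role of facets: choosing one narrows the result list to posts that also mention that entity.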