ROXXI: Reviving witness dOcuments to eXplore eXtracted Information

Authors:
Shady Elbassuoni;Katja Hose;Steffen Metzger;Ralf Schenkel
Affiliations:
Max-Planck Institute for Informatics, Saarbrücken, Germany;Max-Planck Institute for Informatics, Saarbrücken, Germany;Max-Planck Institute for Informatics, Saarbrücken, Germany;Saarland University and MPI for Informatics, Saarbrücken, Germany
Venue:
Proceedings of the VLDB Endowment
Year:
2010


Abstract

In recent years, there has been considerable research on information extraction and the construction of RDF knowledge bases. In general, the goal is to extract all relevant information from a corpus of documents, store it in an ontology, and answer future queries based only on the resulting knowledge base. Thus, the original documents become dispensable. On the one hand, an ontology is a convenient, non-redundant, structured source of information on which specific queries can be answered efficiently. On the other hand, many users doubt the correctness of facts and ontology subgraphs that are presented to them as query results without proof. Instead, users often wish to verify the obtained facts or subgraphs by reading about them in context, i.e., in a document that relates the facts and provides background information. In this demo, we present ROXXI, a system that operates on top of an existing knowledge base and revives the abandoned witness documents. In doing so, it goes the opposite way of information extraction approaches -- starting with ontological facts and tracing them back to the documents they were extracted from. ROXXI offers interfaces for expert users (SPARQL) as well as for non-experts (ontology browser) and provides a ranked list of documents, each associated with a content snippet that highlights the queried facts in context. At the demonstration site, we will show the advantages of this novel approach to document retrieval and illustrate the benefits of reviving the documents that information extraction approaches neglect.
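
The idea of tracing facts back to their witness documents can be illustrated with a minimal sketch. It is not ROXXI's actual implementation: the `PROVENANCE` index, the example facts and sentences, the function name `witness_documents`, and the scoring rule (number of distinct queried facts a document supports) are all assumptions made purely for illustration.

```python
from collections import defaultdict

# Hypothetical provenance index (an assumption, not ROXXI's data model):
# each fact (subject, predicate, object) maps to the documents it was
# extracted from, together with the sentence that supports it.
PROVENANCE = {
    ("Albert_Einstein", "bornIn", "Ulm"): [
        ("doc1", "Albert Einstein was born in Ulm in 1879."),
        ("doc2", "Einstein, born in Ulm, later moved to Munich."),
    ],
    ("Albert_Einstein", "wonPrize", "Nobel_Prize_in_Physics"): [
        ("doc1", "He received the Nobel Prize in Physics in 1921."),
    ],
}


def witness_documents(facts, provenance=PROVENANCE):
    """Rank documents by how many distinct queried facts they support.

    Returns a list of (doc_id, score, snippets) tuples, best first.
    The scoring rule is a simple placeholder for a real ranking model.
    """
    supported = defaultdict(set)   # doc_id -> set of facts it witnesses
    snippets = defaultdict(list)   # doc_id -> supporting sentences
    for fact in facts:
        for doc_id, sentence in provenance.get(fact, []):
            supported[doc_id].add(fact)
            snippets[doc_id].append(sentence)
    # Sort by descending score, breaking ties by document id.
    ranked = sorted(supported, key=lambda d: (-len(supported[d]), d))
    return [(d, len(supported[d]), snippets[d]) for d in ranked]


if __name__ == "__main__":
    for doc_id, score, sentences in witness_documents(list(PROVENANCE)):
        print(doc_id, score, sentences)
```

In this sketch, the facts would come from a query result (for example, the triples matched by a SPARQL query), and the returned sentences play the role of the content snippets that show the queried facts in context.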