iDocument: using ontologies for extracting and annotating information from unstructured text

Authors:
Benjamin Adrian;Jörn Hees;Ludger Van Elst;Andreas Dengel
Affiliations:
Knowledge Management Department, DFKI, Kaiserslautern, Germany;CS Department, University of Kaiserslautern, Kaiserslautern, Germany;Knowledge Management Department, DFKI, Kaiserslautern, Germany;Knowledge Management Department, DFKI, Kaiserslautern, Germany and CS Department, University of Kaiserslautern, Kaiserslautern, Germany
Venue:
KI'09 Proceedings of the 32nd annual German conference on Advances in artificial intelligence
Year:
2009

Abstract

Due to the huge amount of text data on the WWW, annotating unstructured text with semantic markup is a crucial topic in Semantic Web research. This work formally analyzes the incorporation of domain ontologies into information extraction tasks in iDocument. Ontology-based information extraction exploits domain ontologies, which hold formalized and structured domain knowledge, to extract domain-relevant information from un-annotated and unstructured text. iDocument provides a pipeline architecture, an extraction template interface, and the ability to exchange domain ontologies when performing information extraction tasks. This work outlines iDocument's ontology-based architecture, the use of SPARQL queries as extraction templates, and an evaluation of iDocument in an automatic document annotation scenario.
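
To make the idea of a query acting as an extraction template more concrete, the following is a minimal sketch in plain Python. It is not iDocument's implementation or API. The toy ontology, the `ex:` names, and the functions `recognize_instances`, `match`, and `fill_template` are all hypothetical, and the template is a simplified list of triple patterns standing in for a SPARQL query such as `SELECT ?p ?o WHERE { ?p rdf:type ex:Person . ?p ex:worksFor ?o . ?o rdf:type ex:Organization }`.

```python
# Hypothetical toy domain ontology as (subject, predicate, object) triples.
ONTOLOGY = [
    ("ex:Alice", "rdf:type", "ex:Person"),
    ("ex:Alice", "rdfs:label", "Alice Smith"),
    ("ex:Alice", "ex:worksFor", "ex:ExampleCorp"),
    ("ex:Bob", "rdf:type", "ex:Person"),
    ("ex:Bob", "rdfs:label", "Bob Jones"),
    ("ex:ExampleCorp", "rdf:type", "ex:Organization"),
    ("ex:ExampleCorp", "rdfs:label", "Example Corp"),
]

# Extraction template: triple patterns; terms starting with "?" are variables.
TEMPLATE = [
    ("?p", "rdf:type", "ex:Person"),
    ("?p", "ex:worksFor", "?o"),
    ("?o", "rdf:type", "ex:Organization"),
]


def recognize_instances(text, ontology):
    """Return the ontology resources whose rdfs:label occurs in the text."""
    return {s for (s, p, o) in ontology if p == "rdfs:label" and o in text}


def match(pattern, triple, binding):
    """Unify one pattern with one triple; return the extended binding or None."""
    new = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # Bind the variable, or check it against an earlier binding.
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new


def fill_template(template, ontology, mentioned):
    """Evaluate the template, keeping only results mentioned in the text."""
    bindings = [{}]
    for pattern in template:
        extended = []
        for b in bindings:
            for triple in ontology:
                b2 = match(pattern, triple, b)
                if b2 is not None:
                    extended.append(b2)
        bindings = extended
    return [b for b in bindings if all(v in mentioned for v in b.values())]


if __name__ == "__main__":
    text = "Alice Smith gave a talk on behalf of Example Corp."
    mentioned = recognize_instances(text, ONTOLOGY)
    print(fill_template(TEMPLATE, ONTOLOGY, mentioned))
```

Because the template and the ontology are both passed in as data, swapping in a different domain ontology or a different template requires no change to the matching code, which is the property the abstract attributes to the extraction template interface. A real system would use an RDF store and a SPARQL engine rather than this hand-written matcher.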