Learning the ontological theory of an information extraction system in the multi-predicate ILP setting

Authors:
Alain-Pierre Manine
Affiliations:
Univ. Paris, Université Paris, Villetaneuse
Venue:
Proceedings of the 2009 ACM symposium on Applied Computing
Year:
2009




Abstract

In recent years, much work has been devoted to designing Information Extraction (IE) systems able to extract genic interaction networks from text. Usually, extraction is carried out by so-called extraction patterns, which are often limited to mapping textual fragments to a single semantic relation. Such poor representations do not take into account the complexity of the data processed by biologists. IE systems need richer representations, encoded with ontologies, that allow the definition of multiple relations and of the (possibly recursive) dependencies between them. Up to now, the Machine Learning techniques used to acquire extraction patterns, i.e. binary or multi-class learners, have reflected these representational restrictions: they assume independence between target predicates and do not handle recursion. In this paper, we use Inductive Logic Programming (ILP) in a multi-predicate setting to learn extraction patterns fitted to an ontological context. Multi-predicate ILP is an important paradigm that makes it possible to learn recursive theories. We evaluated our framework on a text corpus about the bacterium Bacillus subtilis, on which we reach a global recall of 67.7% and a precision of 75.5% in ten-fold cross-validation.
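To make the notion of a recursive theory over several dependent relations concrete, here is a minimal Python sketch. It is not the paper's ontology or its learner: the predicates `activates` and `regulates`, the two rules, and the gene names are all hypothetical, chosen only for illustration. It shows how such a theory would be applied once available, not how ILP learns it.

```python
def derive(facts):
    """Forward-chain two hypothetical rules until no new fact appears.

    Rule 1:             regulates(A, B) :- activates(A, B).
    Rule 2 (recursive): regulates(A, C) :- activates(A, B), regulates(B, C).

    Facts are (predicate, arg1, arg2) tuples.
    """
    known = set(facts)
    while True:
        new = set()
        for (pred, a, b) in known:
            if pred != "activates":
                continue
            # Rule 1: every activation is also a regulation.
            new.add(("regulates", a, b))
            # Rule 2: chain an activation with an already derived regulation.
            for (pred2, b2, c) in known:
                if pred2 == "regulates" and b2 == b:
                    new.add(("regulates", a, c))
        if new <= known:
            # Fixpoint reached: nothing new was derived in this pass.
            return known
        known |= new


# Hypothetical facts, as if produced by single-relation extraction patterns.
facts = {
    ("activates", "geneA", "geneB"),
    ("activates", "geneB", "geneC"),
}
result = derive(facts)
```

In this sketch `regulates` depends both on `activates` and on itself, so a learner that treats each target relation independently could not represent the second rule. This is the kind of dependency the multi-predicate setting is meant to capture.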