Learning the ontological theory of an information extraction system in the multi-predicate ILP setting

  • Authors: Alain-Pierre Manine
  • Affiliations: Univ. Paris, Université Paris, Villetaneuse
  • Venue: Proceedings of the 2009 ACM Symposium on Applied Computing
  • Year: 2009

Abstract

In recent years, much work has gone into designing Information Extraction (IE) systems able to extract genic interaction networks from text. Extraction is usually carried out by so-called extraction patterns, which are often limited to mapping text fragments to a single semantic relation. Such poor representations do not take into account the complexity of the data processed by biologists. IE systems need sophisticated representations, encoded as ontologies, that allow the definition of multiple relations and of the (possibly recursive) dependencies between them. Up to now, the Machine Learning techniques used to acquire extraction patterns, i.e. binary or multi-class learners, have reflected those representation restrictions: they assume independence between target predicates and do not handle recursion. In this paper, we use Inductive Logic Programming (ILP) in a multi-predicate setting to learn extraction patterns fitted to an ontological context. Multi-predicate ILP is an important paradigm that allows recursive theories to be learned. We evaluated our framework on a text corpus about the bacterium Bacillus subtilis, reaching a global recall of 67.7% and a precision of 75.5% in ten-fold cross-validation.
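To make the idea of a multi-predicate, recursive theory more concrete, here is a minimal Python sketch. It is not the paper's system or its learned rules: the predicate names (`binds`, `promoter_of`, `same_operon`, `regulon_member`, `interaction`), the three rules, and the entity names are all hypothetical, chosen only to show one predicate defined in terms of another and a rule that refers to its own head predicate.

```python
# Toy forward chaining over a HYPOTHETICAL multi-predicate theory.
# Facts are (predicate, arg1, arg2) triples; none of them come from the paper.

FACTS = {
    ("binds", "P1", "prom1"),           # protein P1 binds promoter prom1
    ("promoter_of", "prom1", "geneA"),  # prom1 is the promoter of geneA
    ("same_operon", "geneA", "geneB"),
    ("same_operon", "geneB", "geneC"),
}


def derive(facts):
    """Apply the toy rules until no new fact appears (a fixpoint)."""
    known = set(facts)
    while True:
        new = set()
        for (p, a, b) in known:
            # Rule 1: regulon_member(G, P) :- binds(P, Pr), promoter_of(Pr, G).
            if p == "binds":
                for (q, c, d) in known:
                    if q == "promoter_of" and c == b:
                        new.add(("regulon_member", d, a))
            # Rule 2: interaction(P, G) :- regulon_member(G, P).
            if p == "regulon_member":
                new.add(("interaction", b, a))
            # Rule 3 (recursive): interaction(P, G) :-
            #     interaction(P, H), same_operon(H, G).
            if p == "interaction":
                for (q, c, d) in known:
                    if q == "same_operon" and c == b:
                        new.add(("interaction", a, d))
        if new <= known:  # nothing new was derived: fixpoint reached
            return known
        known |= new


if __name__ == "__main__":
    for fact in sorted(derive(FACTS)):
        if fact[0] == "interaction":
            print(fact)
```

In this sketch, `interaction` depends on `regulon_member` (a dependency between two target predicates) and also on itself through rule 3. A learner that treats each target predicate independently and without recursion could not express rules 2 and 3 together.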