Combining automatic acquisition of knowledge with machine learning approaches for multilingual temporal recognition and normalization

  • Authors:
  • E. Saquete; O. Ferrández; S. Ferrández; P. Martínez-Barco; R. Muñoz

  • Affiliations:
  • Natural Language Processing and Information System Group, Department of Software and Computing Systems, University of Alicante, Apartado de correos 99, 03080 Alicante, Spain (all authors)

  • Venue:
  • Information Sciences: an International Journal
  • Year:
  • 2008

Abstract

This paper presents an improvement in the temporal expression (TE) recognition phase of a knowledge-based system at a multilingual level. For this purpose, the combination of different approaches to the recognition of temporal expressions is studied. For the recognition task, a knowledge-based system that recognizes temporal expressions and has been automatically extended to other languages (the TERSEO system) was combined with a system that recognizes temporal expressions using machine learning techniques. In particular, two techniques were applied, the maximum entropy model (ME) and the hidden Markov model (HMM), using two types of tagging of the training corpus: (1) BIO model tagging of literal temporal expressions and (2) BIO model tagging of simple patterns of temporal expressions. Each system was first evaluated independently and then combined in order to: (a) analyze whether the combination gives better results without increasing the number of erroneous expressions in the same proportion and (b) decide which machine learning approach performs this task better. The best F-measure (89%) is obtained when the TERSEO system is combined with the maximum entropy approach, improving TERSEO recognition by 4.5 points and ME recognition by 7 points.
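As an illustration of the BIO model tagging mentioned in the abstract, the following minimal sketch labels the first token of a temporal expression as B (begin), the remaining tokens of the expression as I (inside), and every other token as O (outside). The example sentence, the function name `bio_tags`, and the token-index span representation are assumptions made for illustration; the abstract does not specify the exact tag format used to annotate the training corpus.

```python
from typing import List, Tuple


def bio_tags(tokens: List[str], spans: List[Tuple[int, int]]) -> List[str]:
    """Return one B/I/O label per token.

    `spans` holds half-open (start, end) token indices of temporal
    expressions; this representation is a hypothetical choice.
    """
    tags = ["O"] * len(tokens)
    for start, end in spans:
        tags[start] = "B"              # first token of the expression
        for i in range(start + 1, end):
            tags[i] = "I"              # remaining tokens of the expression
    return tags


# Illustrative sentence (not taken from the paper's corpus).
tokens = "The meeting was moved to next Monday morning .".split()
spans = [(5, 8)]  # "next Monday morning"
tags = bio_tags(tokens, spans)

for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

A sequence labeller such as an ME or HMM model would then be trained to predict these per-token labels. The second tagging scheme described in the abstract would apply the same B/I/O labels to simple patterns of temporal expressions rather than to their literal tokens.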