Semantic annotation of transcribed audio broadcast news using contextual features in graphical discriminative models

  • Authors:
  • Azeddine Zidouni; Hervé Glotin

  • Affiliations:
  • Laboratoire LSIS UMR CNRS 6168, Université Aix-Marseille 2, France; Laboratoire LSIS UMR CNRS 6168, Université du sud Toulon-Var, France

  • Venue:
  • CICLing'10 Proceedings of the 11th international conference on Computational Linguistics and Intelligent Text Processing
  • Year:
  • 2010

Abstract

In this paper we propose an efficient approach to named entity retrieval (NER) in transcribed speech documents that exploits the hierarchical structure of the entities. The NER task consists of identifying words in a document and classifying them into predefined categories such as person names, locations, organizations, and dates. Classical NER systems usually use generative approaches that learn models from word characteristics alone (word context). In this work we show that NER is also sensitive to syntactic and semantic contexts. For this reason, we introduce an extension of the conditional random fields (CRFs) approach that takes multiple contexts into account. We also present an adaptation of this text-based approach to automatic speech recognition (ASR) outputs. Experimental results show that the proposed approach outperforms a simple application of CRFs. Our experiments use data from the ESTER 2 campaign. The proposed approach ranked 4th among the ESTER 2 participating sites and achieves a significant relative improvement of 18% in slot error rate (SER) over an HMM-based method.
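
The idea of feeding a CRF with several contexts at once can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the window size, the `<pad>` marker, and the tag sets used for the syntactic and semantic layers are all assumptions made for illustration. It only shows how word, syntactic, and semantic observations around each token might be merged into one feature dictionary per token, which is the kind of input a linear-chain CRF toolkit typically expects.

```python
from typing import Dict, List


def token_features(
    words: List[str],
    pos_tags: List[str],
    sem_tags: List[str],
    i: int,
    window: int = 1,
) -> Dict[str, str]:
    """Build a feature dict for token i from three context layers.

    The three layers (word, syntactic, semantic) and the window size
    are illustrative assumptions, not taken from the paper.
    """
    feats = {"bias": "1"}
    layers = {
        # ASR output usually carries no reliable casing, so lowercase.
        "word": [w.lower() for w in words],
        "pos": pos_tags,   # hypothetical syntactic layer
        "sem": sem_tags,   # hypothetical semantic layer
    }
    for name, seq in layers.items():
        for offset in range(-window, window + 1):
            j = i + offset
            # Positions outside the sentence get a padding marker.
            value = seq[j] if 0 <= j < len(seq) else "<pad>"
            feats[f"{name}[{offset:+d}]"] = value
    return feats


def sentence_features(
    words: List[str],
    pos_tags: List[str],
    sem_tags: List[str],
    window: int = 1,
) -> List[Dict[str, str]]:
    """Return one feature dict per token of the sentence."""
    if not (len(words) == len(pos_tags) == len(sem_tags)):
        raise ValueError("all context layers must have the same length")
    return [
        token_features(words, pos_tags, sem_tags, i, window)
        for i in range(len(words))
    ]
```

With a window of 1, each token receives three features per layer plus a bias term. The resulting list of dictionaries could then be passed, together with the entity labels, to any CRF trainer that accepts per-token feature dictionaries.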