Improving chunking by means of lexical-contextual information in statistical language models

  • Authors:
  • Ferran Pla;Antonio Molina;Natividad Prieto

  • Affiliations:
  • Universitat Politècnica de València, València, Spain;Universitat Politècnica de València, València, Spain;Universitat Politècnica de València, València, Spain

  • Venue:
  • ConLL '00 Proceedings of the 2nd workshop on Learning language in logic and the 4th conference on Computational natural language learning - Volume 7
  • Year:
  • 2000

Quantified Score

Hi-index 0.01

Visualization

Abstract

In this work, we present a stochastic approach to shallow parsing. Most of the current approaches to shallow parsing have a common characteristic: they take the sequence of lexical tags proposed by a POS tagger as input for the chunking process. Our system produces tagging and chunking in a single process using an Integrated Language Model (ILM) formalized as Markov Models. This model integrates several knowledge sources: lexical probabilities, a contextual Language Model (LM) for every chunk, and a contextual LM for the sentences. We have extended the ILM by adding lexical information to the contextual LMs. We have applied this approach to the CoNLL-2000 shared task improving the performance of the chunker.