Tagging spoken language using written language statistics

  • Authors:
  • Joakim Nivre; Leif Grönqvist; Malin Gustafsson; Torbjörn Lager; Sylvana Sofkova

  • Affiliations:
  • Göteborg University, Göteborg, Sweden (all five authors)

  • Venue:
  • COLING '96: Proceedings of the 16th Conference on Computational Linguistics, Volume 2
  • Year:
  • 1996

Abstract

This paper reports on two experiments in which a probabilistic part-of-speech tagger, trained on a tagged corpus of written Swedish, is used to tag a corpus of transcribed spoken Swedish. The results indicate that, with very little adaptation, an overall accuracy of 85% can be achieved, rising to 90% for known words. In addition, two different treatments of pauses were explored, but neither yielded a significant gain in accuracy.
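
To make the distinction between the two reported figures concrete, the following is a minimal sketch of how overall accuracy and known-word accuracy might be computed for a tagged test set. It assumes that "known words" means word forms that also occur in the training corpus; the function name `tagging_accuracy`, the lowercase matching, the tag labels, and the toy data are all hypothetical and are not taken from the paper.

```python
from typing import Iterable, Set, Tuple


def tagging_accuracy(
    gold: Iterable[Tuple[str, str]],
    predicted: Iterable[str],
    training_vocab: Set[str],
) -> Tuple[float, float]:
    """Return (overall accuracy, accuracy restricted to known words).

    Assumption: a word is "known" if its lowercased form occurs in the
    vocabulary of the (written-language) training corpus.
    """
    total = correct = known_total = known_correct = 0
    for (word, gold_tag), pred_tag in zip(gold, predicted):
        hit = int(gold_tag == pred_tag)
        total += 1
        correct += hit
        if word.lower() in training_vocab:
            known_total += 1
            known_correct += hit
    overall = correct / total if total else 0.0
    known = known_correct / known_total if known_total else 0.0
    return overall, known


# Hypothetical toy data: (word, gold tag) pairs from a spoken transcript,
# the tagger's output, and the vocabulary seen in written training text.
gold = [("jag", "PRON"), ("vet", "VERB"), ("inte", "ADV"), ("asså", "INTJ")]
predicted = ["PRON", "VERB", "ADV", "NOUN"]
vocab = {"jag", "vet", "inte"}

overall, known = tagging_accuracy(gold, predicted, vocab)
print(f"overall={overall:.2f} known={known:.2f}")
```

In this toy example the only error falls on a spoken-language form absent from the written vocabulary, so the known-word score is higher than the overall score, which is the same direction as the gap reported in the abstract.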