Recognizing text genres with simple metrics using discriminant analysis

  • Authors: Jussi Karlgren; Douglass Cutting
  • Affiliations: Swedish Institute of Computer Science, Kista, Stockholm, Sweden; Apple Computer, Cupertino, CA
  • Venue: COLING '94 Proceedings of the 15th conference on Computational linguistics - Volume 2
  • Year: 1994

Abstract

A simple method for categorizing texts into predetermined genre categories using the standard statistical technique of discriminant analysis is demonstrated on the Brown corpus. Discriminant analysis makes it possible to use a large number of parameters that may be specific to a certain corpus or information stream, and to combine them into a small number of functions, with the parameters weighted on the basis of how useful they are for discriminating between text genres. An application to information retrieval is discussed.
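
To make the idea of combining several parameters into one weighted function concrete, the following is a minimal sketch of a two-class Fisher linear discriminant in plain Python. It is not the authors' implementation. The two features (mean word length, mean sentence length), the toy values, the genre labels, the small `ridge` term, and all function names are assumptions chosen for illustration only.

```python
# Illustrative sketch only: two-class Fisher linear discriminant.
# Features, toy values, labels, and names are hypothetical, not taken from the paper.

def mean(rows):
    """Column-wise mean of a list of feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def scatter(rows, mu):
    """Within-class scatter matrix: sum of outer products of deviations from mu."""
    d = len(mu)
    s = [[0.0] * d for _ in range(d)]
    for r in rows:
        diff = [x - m for x, m in zip(r, mu)]
        for i in range(d):
            for j in range(d):
                s[i][j] += diff[i] * diff[j]
    return s

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][k] * x[k] for k in range(i + 1, n))) / m[i][i]
    return x

def fit(class_a, class_b, ridge=1e-6):
    """Return (weights, threshold) with weights = Sw^-1 (mu_a - mu_b).

    The ridge term on the diagonal is an assumption added for numerical stability.
    """
    mu_a, mu_b = mean(class_a), mean(class_b)
    sa, sb = scatter(class_a, mu_a), scatter(class_b, mu_b)
    d = len(mu_a)
    sw = [[sa[i][j] + sb[i][j] + (ridge if i == j else 0.0) for j in range(d)]
          for i in range(d)]
    w = solve(sw, [a - b for a, b in zip(mu_a, mu_b)])
    # Threshold is the projection of the midpoint between the two class means.
    threshold = sum(wi * (a + b) / 2 for wi, a, b in zip(w, mu_a, mu_b))
    return w, threshold

def classify(w, threshold, x, label_a="imaginative", label_b="informative"):
    """Project x onto the discriminant and compare with the threshold."""
    score = sum(wi * xi for wi, xi in zip(w, x))
    return label_a if score > threshold else label_b

# Toy feature vectors: [mean word length, mean sentence length in words].
class_a = [[4.0, 12.0], [4.3, 11.0], [3.9, 14.0]]   # hypothetical "imaginative" texts
class_b = [[5.1, 22.0], [5.4, 20.0], [4.9, 25.0]]   # hypothetical "informative" texts

weights, threshold = fit(class_a, class_b)
print(weights, threshold)
print(classify(weights, threshold, [4.1, 13.0]))
```

The learned weights play the role the abstract describes: each parameter contributes to the discriminant function in proportion to how well it separates the classes. With more than two genres, one would typically derive several such functions; this sketch covers only the two-class case.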