Automatic natural language style classification and transformation

  • Authors:
  • Foaad Khosmood; Robert A. Levinson

  • Affiliations:
  • University of California Santa Cruz, Department of Computer Science, Santa Cruz, CA;University of California Santa Cruz, Department of Computer Science, Santa Cruz, CA

  • Venue:
  • IRSG'08 Proceedings of the 2008 BCS-IRSG conference on Corpus Profiling
  • Year:
  • 2008

Abstract

Style is an integral part of natural language in its written, spoken, and machine-generated forms. Humans have dealt with style since the beginnings of language itself, but computers have only recently begun to process natural language styles. Automatic processing of style poses two interrelated challenges: classification and transformation. Recent advances have been made in corpus classification, automatic clustering, and authorship attribution along many dimensions, but little work addresses writing style directly, and even less addresses style transformation. In this paper we examine the relevant literature to define and operationalize a notion of "style", which we use to designate style markers usable in classification machines. A measurable reading of these markers also helps guide style transformation algorithms. We demonstrate the concept by showing a detectable stylistic shift in a sample piece of text relative to a target corpus. We present ongoing work on building a comprehensive style recognition and transformation system and discuss our results.
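
The idea of a "measurable reading" of style markers, and of a stylistic shift relative to a target corpus, can be illustrated with a minimal sketch. The abstract does not list the paper's actual markers or its distance measure, so everything below is an assumption made for illustration: the marker set (words per sentence, characters per word, rates of a few function words), the names `style_markers` and `style_distance`, and the use of plain Euclidean distance.

```python
import re
from math import sqrt

# Hypothetical marker vocabulary; the paper's real marker set is not given here.
FUNCTION_WORDS = ("the", "of", "and", "to", "in")


def style_markers(text):
    """Return a dict of simple, illustrative style markers for a text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text.lower())
    if not sentences or not words:
        raise ValueError("text must contain at least one word")
    markers = {
        "words_per_sentence": len(words) / len(sentences),
        "chars_per_word": sum(len(w) for w in words) / len(words),
    }
    for fw in FUNCTION_WORDS:
        markers["rate_" + fw] = words.count(fw) / len(words)
    return markers


def style_distance(a, b):
    """Euclidean distance between two marker dicts with the same keys."""
    return sqrt(sum((a[k] - b[k]) ** 2 for k in a))


# Toy data, invented for this sketch.
target = style_markers("The cat sat. The dog ran. The bird flew.")
original = style_markers(
    "Notwithstanding considerable meteorological uncertainty, expeditions continued."
)
rewritten = style_markers("The trip went on. The weather was bad.")

# A rewrite counts as a shift toward the target if its distance shrinks.
shifted_toward_target = (
    style_distance(rewritten, target) < style_distance(original, target)
)
```

In this toy setup the rewritten sentence lies closer to the target than the original does, which is the kind of detectable shift the abstract describes. The raw markers are on different scales, so a more careful version would normalize each marker before computing distances.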