Randomized rule selection in transformation-based learning: a comparative study

  • Authors:
  • Sandra Carberry;K. Vijay-Shanker;Andrew Wilson;Ken Samuel

  • Affiliations:
  • Department of Computer Science, University of Delaware, Newark, Delaware 19716, USA/ e-mail: carberry@cis.udel.edu, vijay@cis.udel.edu, awilson@cis.udel.edu;The Mitre Corporation, Reston, VA 22090, USA/ e-mail: samuel@mitre.org

  • Venue:
  • Natural Language Engineering
  • Year:
  • 2001

Abstract

Transformation-Based Learning (TBL) is a relatively new machine learning method that has achieved notable success on language problems. This paper presents a variant of TBL, called Randomized TBL, that overcomes the training time problems of standard TBL without sacrificing accuracy. The paper reports a set of experiments on part-of-speech tagging in which the size of the corpus and the size of the template set are varied. The results show that Randomized TBL can address problems that are intractable in terms of training time for standard TBL. In addition, for language problems such as dialogue act tagging, where the most effective features have not been identified through linguistic studies, Randomized TBL allows the researcher to experiment with a large set of templates capturing many potentially useful features and feature interactions.
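
The abstract does not spell out how the randomization works, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that at every tagging error the learner instantiates a random sample of the available templates instead of all of them, and then greedily picks the best-scoring candidate rule, as in standard TBL. The template set, the rule tuple layout, and the names `TEMPLATES`, `train`, and `sample_size` are all hypothetical.

```python
import random

# Hypothetical templates (assumption): each maps a position in the current
# tagging to the context value that a rule built from it must match.
TEMPLATES = {
    "prev_tag": lambda tags, words, i: tags[i - 1] if i > 0 else "<s>",
    "next_tag": lambda tags, words, i: tags[i + 1] if i + 1 < len(tags) else "</s>",
    "word": lambda tags, words, i: words[i],
}

# A rule is (template_name, context, from_tag, to_tag).
def applies(rule, tags, words, i):
    name, context, from_tag, _to_tag = rule
    return tags[i] == from_tag and TEMPLATES[name](tags, words, i) == context

def score(rule, tags, words, gold):
    """Net improvement: errors the rule fixes minus errors it introduces."""
    to_tag = rule[3]
    net = 0
    for i in range(len(tags)):
        if applies(rule, tags, words, i):
            if gold[i] == to_tag:
                net += 1
            elif gold[i] == tags[i]:
                net -= 1
    return net

def apply_rule(rule, tags, words):
    # Find every firing site against the old tagging, then rewrite them all.
    sites = [i for i in range(len(tags)) if applies(rule, tags, words, i)]
    new_tags = list(tags)
    for i in sites:
        new_tags[i] = rule[3]
    return new_tags

def train(words, gold, initial, sample_size, rng, max_rules=20):
    tags, learned = list(initial), []
    names = sorted(TEMPLATES)
    for _ in range(max_rules):
        candidates = set()
        for i in range(len(tags)):
            if tags[i] == gold[i]:
                continue
            # Assumed randomized step: instantiate only a random subset of
            # the templates at this error instead of every template.
            for name in rng.sample(names, min(sample_size, len(names))):
                context = TEMPLATES[name](tags, words, i)
                candidates.add((name, context, tags[i], gold[i]))
        if not candidates:
            break
        best = max(sorted(candidates), key=lambda r: score(r, tags, words, gold))
        if score(best, tags, words, gold) <= 0:
            break  # no sampled rule gives a net gain
        learned.append(best)
        tags = apply_rule(best, tags, words)
    return learned, tags
```

Setting `sample_size` to the number of templates recovers the exhaustive candidate generation of standard TBL in this sketch, while a smaller value caps the number of rules scored per error regardless of how large the template set grows. Because a rule is accepted only when its net score is positive, the number of tagging errors in the toy learner never increases from one iteration to the next.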