Randomized rule selection in transformation-based learning: a comparative study

Authors:
Sandra Carberry;K. Vijay-Shanker;Andrew Wilson;Ken Samuel
Affiliations:
Department of Computer Science, University of Delaware, Newark, Delaware 19716, USA (Carberry, Vijay-Shanker, Wilson; e-mail: carberry@cis.udel.edu, vijay@cis.udel.edu, awilson@cis.udel.edu);The Mitre Corporation, Reston, VA 22090, USA (Samuel; e-mail: samuel@mitre.org)
Venue:
Natural Language Engineering
Year:
2001

Abstract

Transformation-Based Learning (TBL) is a relatively new machine learning method that has achieved notable success on language problems. This paper presents a variant of TBL, called Randomized TBL, that overcomes the training-time problems of standard TBL without sacrificing accuracy. It reports a set of experiments on part-of-speech tagging in which the corpus size and the template set are varied. The results show that Randomized TBL can address problems whose training time makes them intractable for standard TBL. In addition, for language problems such as dialogue act tagging, where the most effective features have not been identified through linguistic studies, Randomized TBL allows the researcher to experiment with a large set of templates capturing many potentially useful features and feature interactions.
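
The abstract does not spell out how the randomization works, so the following is only a minimal sketch of the general idea suggested by the title: instead of instantiating every template at every tagging error, a learner could instantiate a random sample of templates at each error and score only those candidate rules. The toy templates, the `sample_size` parameter, the scoring function, and the tiny corpus are all assumptions made for illustration and are not taken from the paper.

```python
import random

# Hypothetical templates: each maps (words, tags, i) to a context value.
TEMPLATES = {
    "prev_tag": lambda w, t, i: t[i - 1] if i > 0 else "<s>",
    "next_tag": lambda w, t, i: t[i + 1] if i + 1 < len(t) else "</s>",
    "word": lambda w, t, i: w[i],
    "prev_word": lambda w, t, i: w[i - 1] if i > 0 else "<s>",
}

def matches(rule, words, tags, i):
    """A rule (template, value, old_tag, new_tag) fires where both conditions hold."""
    name, value, old, _new = rule
    return tags[i] == old and TEMPLATES[name](words, tags, i) == value

def score(rule, words, tags, gold):
    """Corrections made minus correct tags broken, over the whole corpus."""
    total = 0
    for i in range(len(tags)):
        if matches(rule, words, tags, i):
            if gold[i] == rule[3]:
                total += 1
            elif gold[i] == tags[i]:
                total -= 1
    return total

def apply_rule(rule, words, tags):
    """Retag every matching position, deciding matches on the old tags."""
    hits = [i for i in range(len(tags)) if matches(rule, words, tags, i)]
    out = list(tags)
    for i in hits:
        out[i] = rule[3]
    return out

def train(words, tags, gold, sample_size=None, seed=0, max_rules=20):
    """Greedy TBL loop. sample_size=None instantiates every template at each
    error (standard behaviour); an integer instantiates only that many
    randomly chosen templates per error (the assumed randomized variant)."""
    rng = random.Random(seed)
    names = sorted(TEMPLATES)
    tags = list(tags)
    learned = []
    for _ in range(max_rules):
        candidates = set()
        for i in range(len(tags)):
            if tags[i] == gold[i]:
                continue  # error-driven: only errors propose rules
            if sample_size is None:
                chosen = names
            else:
                chosen = rng.sample(names, min(sample_size, len(names)))
            for name in chosen:
                value = TEMPLATES[name](words, tags, i)
                candidates.add((name, value, tags[i], gold[i]))
        if not candidates:
            break
        # Sorting first makes tie-breaking deterministic.
        best = max(sorted(candidates), key=lambda r: score(r, words, tags, gold))
        if score(best, words, tags, gold) <= 0:
            break
        learned.append(best)
        tags = apply_rule(best, words, tags)
    return learned, tags

# Toy data: "can" starts out tagged MD everywhere, which is wrong after "the".
words = ["the", "can", "will", "rust", "the", "dog", "can", "run"]
gold = ["DT", "NN", "MD", "VB", "DT", "NN", "MD", "VB"]
start = ["DT", "MD", "MD", "VB", "DT", "NN", "MD", "VB"]

rules_full, tags_full = train(words, start, gold)
rules_rand, tags_rand = train(words, start, gold, sample_size=2, seed=1)
print(rules_full, rules_rand)
```

In this sketch the number of candidate rules scored per iteration grows with `sample_size` rather than with the total number of templates, which is one way a large template set could remain affordable. Whether this matches the paper's actual procedure cannot be determined from the abstract alone.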