On the implementation of a baseline part-of-speech tagger

Authors:
James Markham; Vasile Rus
Affiliations:
The University of Memphis, Memphis, TN; The University of Memphis, Memphis, TN
Venue:
Journal of Computing Sciences in Colleges
Year:
2005

Abstract

Part-of-speech tagging is an important step in many text-understanding applications, including machine translation, question answering, and Internet search. In this paper we present a semester-long project whose goal was to implement a baseline part-of-speech tagger efficiently while keeping costs low. The project was carried out in an undergraduate class in Advanced Data Structures, where we adopted a design-with-reuse method. This approach allowed us to deliver the software on time and with the desired functionality. It also let us focus on design and conceptual issues rather than on implementing data structures and related algorithms, such as TreeMaps, that are already available in free libraries.
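
The abstract does not say which tagging rule or library calls the project used, so the following is only a minimal sketch of the reuse idea under stated assumptions. It assumes that "baseline" means the common most-frequent-tag rule (each word receives the tag it carried most often in training data) and that the reused structure is `java.util.TreeMap`. The class name `BaselineTagger`, its methods, the fallback tag, and the sample words and tags are all hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of a most-frequent-tag baseline; not the authors' code.
class BaselineTagger {
    // word -> (tag -> number of times the pair was seen in training)
    private final Map<String, Map<String, Integer>> counts = new TreeMap<>();
    // Assumed fallback for words never seen in training.
    private final String defaultTag;

    BaselineTagger(String defaultTag) {
        this.defaultTag = defaultTag;
    }

    // Record one (word, tag) observation from an annotated corpus.
    void train(String word, String tag) {
        counts.computeIfAbsent(word, w -> new TreeMap<>())
              .merge(tag, 1, Integer::sum);
    }

    // Return the tag seen most often with this word, or the fallback.
    // On a tie, the alphabetically first tag wins (TreeMap iteration order).
    String tag(String word) {
        Map<String, Integer> tagCounts = counts.get(word);
        if (tagCounts == null) {
            return defaultTag;
        }
        String best = defaultTag;
        int bestCount = 0;
        for (Map.Entry<String, Integer> e : tagCounts.entrySet()) {
            if (e.getValue() > bestCount) {
                best = e.getKey();
                bestCount = e.getValue();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        BaselineTagger tagger = new BaselineTagger("NN");
        tagger.train("the", "DT");
        tagger.train("run", "VB");
        tagger.train("run", "NN");
        tagger.train("run", "VB");
        for (String w : new String[] {"the", "run", "tagger"}) {
            System.out.println(w + "/" + tagger.tag(w));
        }
    }
}
```

In this sketch the nested maps come entirely from the standard library, so the only project-specific code is the counting and lookup logic. That is the kind of division of labor a design-with-reuse approach aims for.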