Classifier assignment by corpus-based approach

Authors:
Virach Sornlertlamvanich;Wantanee Pantachat;Surapant Meknavin
Affiliations:
Ministry of Science Technology and Environment, Bangkok, Thailand;Ministry of Science Technology and Environment, Bangkok, Thailand;Ministry of Science Technology and Environment, Bangkok, Thailand
Venue:
COLING '94 Proceedings of the 15th conference on Computational linguistics - Volume 1
Year:
1994

References (2):

Retrieving collocations from text: Xtract
Computational Linguistics - Special issue on using large corpora: I

Co-occurrence patterns among collocations: a tool for corpus-based lexical knowledge acquisition
Computational Linguistics

Cited by (5):

Reusing an ontology to generate numeral classifiers
COLING '00 Proceedings of the 18th conference on Computational linguistics - Volume 1

Classifiers in Japanese-to-English machine translation
COLING '96 Proceedings of the 16th conference on Computational linguistics - Volume 1

Corpus-based generation of numeral classifier using phrase alignment
COLING '02 Proceedings of the 19th international conference on Computational linguistics - Volume 1

Infrastructure for standardization of Asian language resources
COLING-ACL '06 Proceedings of the COLING/ACL on Main conference poster sessions

Web and corpus methods for Malay count classifier prediction
NAACL-Short '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Short Papers

Abstract

This paper presents an algorithm for selecting an appropriate classifier word for a noun. In Thai, the choice of classifier for a given concrete noun frequently fluctuates, both across the speech community as a whole and among individual speakers. There is essentially no exact rule for classifier selection. The most a rule-based approach can do is provide a default rule that picks a corresponding classifier for each noun. Registering a classifier for each noun is limited to unit classifiers, because the other types are open, depending on the meaning to be represented. We propose a corpus-based method (Biber, 1993; Nagao, 1993; Smadja, 1993) that generates Noun Classifier Associations (NCA) to overcome the problems in classifier assignment and in the semantic construction of noun phrases. The NCA is created statistically from a large corpus and recomposed under concept hierarchy constraints and frequency of occurrence.
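
The abstract does not spell out the algorithm, so the following Python sketch only illustrates the general idea of a frequency-based noun-classifier association with back-off through a concept hierarchy. It is not the authors' implementation. The function names (`build_nca`, `aggregate_by_concept`, `assign_classifier`), the `parent_of` mapping, the back-off strategy, and the toy data are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def build_nca(pairs):
    """Count how often each classifier co-occurs with each noun.

    `pairs` is an iterable of (noun, classifier) tuples, assumed to have
    been extracted from a corpus beforehand.
    """
    nca = defaultdict(Counter)
    for noun, classifier in pairs:
        nca[noun][classifier] += 1
    return nca


def aggregate_by_concept(nca, parent_of):
    """Sum classifier counts over every ancestor concept of each noun.

    `parent_of` is a hypothetical child -> parent mapping standing in
    for a concept hierarchy.
    """
    concept_counts = defaultdict(Counter)
    for noun, counts in nca.items():
        node = parent_of.get(noun)
        seen = set()
        while node is not None and node not in seen:  # guard against cycles
            concept_counts[node].update(counts)
            seen.add(node)
            node = parent_of.get(node)
    return concept_counts


def assign_classifier(noun, nca, concept_counts, parent_of, default=None):
    """Return the most frequent classifier seen with `noun`.

    If the noun was never observed, back off to the nearest ancestor
    concept that has counts; otherwise return `default`.
    """
    counts = nca.get(noun)
    if counts:
        return counts.most_common(1)[0][0]
    node = parent_of.get(noun)
    seen = set()
    while node is not None and node not in seen:
        counts = concept_counts.get(node)
        if counts:
            return counts.most_common(1)[0][0]
        seen.add(node)
        node = parent_of.get(node)
    return default


# Toy data, invented for illustration (English glosses, romanized classifiers).
pairs = [
    ("dog", "tua"), ("dog", "tua"), ("cat", "tua"),
    ("student", "khon"),
    ("book", "lem"), ("book", "lem"), ("book", "chabap"),
]
parent_of = {
    "dog": "animal", "cat": "animal", "tiger": "animal",
    "student": "person", "teacher": "person",
}

nca = build_nca(pairs)
concept_counts = aggregate_by_concept(nca, parent_of)
```

In this sketch a noun seen in the corpus gets its most frequent classifier, while an unseen noun such as `"tiger"` inherits the dominant classifier of its nearest ancestor concept. Keeping the full counts rather than a single registered classifier is one way to reflect the fluctuation in classifier choice that the abstract describes.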