Web and corpus methods for Malay count classifier prediction

  • Authors:
  • Jeremy Nicholson; Timothy Baldwin

  • Affiliations:
  • University of Melbourne, VIC, Australia; University of Melbourne, VIC, Australia

  • Venue:
  • NAACL-Short '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Short Papers
  • Year:
  • 2009

Abstract

We examine the capacity of Web and corpus frequency methods to predict preferred count classifiers for nouns in Malay. The observed F-score for the Web model of 0.671 considerably outperformed corpus-based frequency and machine learning models. We expect that this is a fruitful extension for Web-as-corpus approaches to lexicons in languages other than English, but further research is required in other South-East and East Asian languages.
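As a rough illustration of the kind of frequency-based prediction and F-score evaluation the abstract refers to, the sketch below picks, for each noun, the classifier that co-occurs with it most often, and then scores the predictions against a gold standard. This is not the authors' implementation: the function names, the co-occurrence counts, and the exact precision/recall definition (unanswered nouns hurt recall only) are all assumptions made for the example. The classifiers orang, ekor and buah are real Malay classifiers, but the numbers attached to them are invented.

```python
def predict_classifier(counts):
    """Return the classifier with the highest count, or None if there is no evidence."""
    if not counts or max(counts.values()) == 0:
        return None
    return max(counts, key=counts.get)


def f_score(predictions, gold):
    """Harmonic mean of precision and recall over the nouns in `gold`.

    Assumption: a noun with no prediction (None) lowers recall but not precision.
    """
    attempted = [noun for noun in gold if predictions.get(noun) is not None]
    correct = sum(1 for noun in attempted if predictions[noun] == gold[noun])
    precision = correct / len(attempted) if attempted else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts of "<classifier> <noun>" patterns (invented numbers).
hypothetical_counts = {
    "guru":   {"orang": 120, "ekor": 2,  "buah": 5},   # teacher
    "kucing": {"orang": 3,   "ekor": 80, "buah": 1},   # cat
    "rumah":  {"orang": 0,   "ekor": 0,  "buah": 0},   # house: no evidence
    "kereta": {"orang": 1,   "ekor": 40, "buah": 30},  # car: noisy counts
}
gold = {"guru": "orang", "kucing": "ekor", "rumah": "buah", "kereta": "buah"}

predictions = {noun: predict_classifier(c) for noun, c in hypothetical_counts.items()}
score = f_score(predictions, gold)
print(predictions)
print(round(score, 3))
```

In this toy data two of the three attempted nouns are correct and one noun gets no prediction, so precision and recall differ; the 0.671 figure in the abstract comes from the paper's own data and evaluation setup, which this sketch does not reproduce.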