This paper describes our system for SemEval-2010 Task 15, Infrequent Sense Identification for Mandarin Text to Speech Systems. The core is a supervised system based on ensembles of Naïve Bayesian classifiers. To address the unbalanced sense distribution, we deliberately extract only instances of the infrequent sense with the same N-gram pattern from an untagged Chinese corpus (People's Daily, 2001) and use them as complementary training data. We also adjust the prior probability to match the sense distribution of the test data and tune the smoothing coefficient to account for data sparseness. According to the official results, our system ranked first, with the best Macro Accuracy of 0.952. We briefly describe the system, its configuration options, and the features used for this task, and discuss the results.
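To make the roles of the prior and the smoothing coefficient concrete, the following is a minimal sketch of a Naïve Bayes sense classifier in which both can be set by hand, plus a simple majority-vote ensemble. It is not the system's actual implementation: the class `NaiveBayesSense`, the function `ensemble_predict`, the additive smoothing scheme, the voting rule, and the toy features and labels are all assumptions chosen for illustration.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesSense:
    """Toy multinomial Naive Bayes (hypothetical, for illustration only)."""

    def __init__(self, alpha=1.0, priors=None):
        self.alpha = alpha      # additive smoothing coefficient (assumed scheme)
        self.priors = priors    # optional hand-set sense priors

    def fit(self, contexts, senses):
        self.counts_ = defaultdict(Counter)
        for feats, sense in zip(contexts, senses):
            self.counts_[sense].update(feats)
        self.vocab_ = {f for c in self.counts_.values() for f in c}
        if self.priors is None:
            # Default: relative sense frequencies in the training data.
            freq = Counter(senses)
            self.priors_ = {s: n / len(senses) for s, n in freq.items()}
        else:
            # Adjusted: priors supplied to reflect the expected test distribution.
            self.priors_ = dict(self.priors)
        return self

    def log_scores(self, feats):
        v = len(self.vocab_)
        scores = {}
        for sense, prior in self.priors_.items():
            total = sum(self.counts_[sense].values())
            score = math.log(prior)
            for f in feats:
                if f in self.vocab_:  # features unseen in training are skipped
                    num = self.counts_[sense][f] + self.alpha
                    score += math.log(num / (total + self.alpha * v))
            scores[sense] = score
        return scores

    def predict(self, feats):
        scores = self.log_scores(feats)
        return max(scores, key=scores.get)


def ensemble_predict(models, feats):
    """Majority vote over member classifiers (one assumed combination rule)."""
    votes = Counter(m.predict(feats) for m in models)
    return votes.most_common(1)[0][0]


# Toy, unbalanced training set: three frequent-sense contexts, one infrequent.
contexts = [["a", "b"], ["a", "c"], ["a", "b"], ["d", "b"]]
senses = ["frequent", "frequent", "frequent", "infrequent"]

default_nb = NaiveBayesSense(alpha=1.0).fit(contexts, senses)
adjusted_nb = NaiveBayesSense(
    alpha=1.0, priors={"frequent": 0.5, "infrequent": 0.5}
).fit(contexts, senses)

pred_default = default_nb.predict(["b"])
pred_adjusted = adjusted_nb.predict(["b"])
pred_ensemble = ensemble_predict([default_nb, adjusted_nb, adjusted_nb], ["b"])
```

On this toy data the training-set prior (0.75 vs. 0.25) pushes the ambiguous context `["b"]` toward the frequent sense, while the balanced prior lets the likelihood term decide in favor of the infrequent sense. Lowering `alpha` sharpens the likelihoods and raising it flattens them, which is the trade-off that tuning a smoothing coefficient for sparse data has to balance.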