Morphological analysis of a large spontaneous speech corpus in Japanese
ACL '03 Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1
We propose an efficient framework for human-aided morphological annotation of a large spontaneous speech corpus such as the Corpus of Spontaneous Japanese. The framework handles corpora in which word units have several definitions and in which not all words appear in a dictionary or a training corpus. It detects words that are missing from the dictionary and adds them to it, which lets us morphologically analyze the corpus with high accuracy and low labor costs. Expanding the training corpora through active learning reduces labor costs further.
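
The detect-and-add step described above can be illustrated with a minimal sketch. It is not the paper's method. It assumes the text is already segmented into tokens, uses a plain set lookup in place of a trained detector, and stands in for the human annotator with a callback. The names `find_unknown_words`, `annotate_round`, and `approve` are hypothetical, and the romanized tokens are made-up sample data.

```python
from collections import Counter


def find_unknown_words(tokens, dictionary):
    """Return tokens absent from the dictionary, most frequent first."""
    counts = Counter(t for t in tokens if t not in dictionary)
    return [word for word, _ in counts.most_common()]


def annotate_round(tokens, dictionary, approve):
    """One human-in-the-loop pass: propose unknown words, add approved ones.

    `approve` is a hypothetical callback standing in for a human annotator.
    """
    added = []
    for word in find_unknown_words(tokens, dictionary):
        if approve(word):
            dictionary.add(word)
            added.append(word)
    return added


# Made-up sample: a filler ("eeto") and a noun ("happyou") are not yet listed.
tokens = ["eeto", "kyou", "wa", "eeto", "happyou", "shi", "masu"]
dictionary = {"kyou", "wa", "shi", "masu"}

added = annotate_round(tokens, dictionary, approve=lambda word: True)
print(added)
```

Ranking candidates by frequency is one simple way to spend annotator time on the words that affect the most tokens first. It is a design choice of this sketch, not something the abstract specifies.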