In this paper, we present a two-stage approach to acquiring Japanese unknown morphemes from text and assigning full POS tags to them. We first acquire unknown morphemes while making only a morphology-level distinction, and then apply semantic classification to the acquired nouns. One advantage of this approach is that the second stage can exploit syntactic clues in addition to morphological ones: once the first stage has acquired the morphemes, we can rely on automatic parsing. Japanese semantic classification poses an interesting challenge: proper nouns need to be distinguished from common nouns. This is because Japanese has no orthographic distinction between common and proper nouns and no apparent morphosyntactic distinction between them. We explore lexico-syntactic clues extracted from automatically parsed text and investigate their effects.