Noun-phrase co-occurrence statistics for semiautomatic semantic lexicon construction
COLING '98 Proceedings of the 17th international conference on Computational linguistics - Volume 2
"More like these": growing entity classes from seeds
Proceedings of the sixteenth ACM conference on Conference on information and knowledge management
COLING '08 Proceedings of the 22nd International Conference on Computational Linguistics - Volume 1
Semi-automatic entity set refinement
NAACL '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Active learning with statistical models
Journal of Artificial Intelligence Research
Automatic set instance extraction using the web
ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1 - Volume 1
Employing topic models for pattern-based semantic class discovery
ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1 - Volume 1
Entity extraction via ensemble semantics
EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 1 - Volume 1
Web-scale distributional similarity and entity set expansion
EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 2 - Volume 2
Hi-index | 0.00 |
In this paper we present a Web service for building NLP resources to construct semantic word classes in Japanese. The system takes a few seed words belonging to the target class as input and uses automatic class expansion to suggest semantically similar training samples for the user to label. The system automatically generates random negative training samples as well, and then trains a supervised classifier on this labeled data to generate the target word class from 107 candidate words extracted from a corpus of of 108 Web documents. This system eliminates the need for expert machine learning knowledge in creating semantic word classes, and we experimentally show that it significantly reduces the human effort required to build them.