A web service for automatic word class acquisition

  • Authors:
  • Stijn De Saeger;Jun'ichi Kazama;Kentaro Torisawa;Masaki Murata;Ichiro Yamada;Kow Kuroda

  • Affiliations:
  • National Institute of Information and Communications Technology (NICT), Seikacho, Kyoto, Japan (all authors)

  • Venue:
  • Proceedings of the 3rd International Universal Communication Symposium
  • Year:
  • 2009

Abstract

In this paper we present a Web service for building NLP resources, specifically semantic word classes in Japanese. The system takes a few seed words belonging to the target class as input and uses automatic class expansion to suggest semantically similar training samples for the user to label. It also automatically generates random negative training samples, and then trains a supervised classifier on this labeled data to generate the target word class from 10^7 candidate words extracted from a corpus of 10^8 Web documents. The system eliminates the need for expert machine learning knowledge in creating semantic word classes, and we show experimentally that it significantly reduces the human effort required to build them.
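
The workflow described in the abstract (seed words, class expansion, user labeling, random negatives, supervised classification) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not state which similarity measure, features, or classifier the system uses, so Jaccard similarity over toy context sets and a simple perceptron stand in for them, and the `CONTEXTS` data and all function names are hypothetical.

```python
import random

# Hypothetical toy data: each candidate word with the contexts it occurs in.
# The real system draws its candidates from a large Web corpus.
CONTEXTS = {
    "apple": {"eat", "sweet", "peel"},
    "banana": {"eat", "sweet", "peel"},
    "grape": {"eat", "sweet", "wine"},
    "melon": {"eat", "sweet", "slice"},
    "car": {"drive", "park", "engine"},
    "truck": {"drive", "park", "load"},
    "bus": {"drive", "ride", "engine"},
    "train": {"ride", "station", "engine"},
}


def jaccard(a, b):
    """Set overlap, used here as a stand-in similarity measure."""
    return len(a & b) / len(a | b)


def expand(seeds, k):
    """Suggest the k non-seed candidates most similar to any seed word."""
    pool = [w for w in CONTEXTS if w not in seeds]

    def score(word):
        return max(jaccard(CONTEXTS[word], CONTEXTS[s]) for s in seeds)

    return sorted(pool, key=score, reverse=True)[:k]


def sample_negatives(exclude, n, rng):
    """Draw n random words outside `exclude` as presumed negatives."""
    pool = sorted(w for w in CONTEXTS if w not in exclude)
    return rng.sample(pool, n)


def train_perceptron(labeled, epochs=20):
    """Train a perceptron on (word, label) pairs with labels +1 / -1."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for word, label in labeled:
            activation = bias + sum(weights.get(f, 0.0) for f in CONTEXTS[word])
            if label * activation <= 0:  # misclassified: update the model
                for f in CONTEXTS[word]:
                    weights[f] = weights.get(f, 0.0) + label
                bias += label
    return weights, bias


def classify(model, word):
    """Return True if the model places `word` in the target class."""
    weights, bias = model
    return bias + sum(weights.get(f, 0.0) for f in CONTEXTS[word]) > 0


if __name__ == "__main__":
    seeds = ["apple"]
    suggestions = expand(seeds, k=2)
    # Stand-in for the labeling step: here the user accepts every suggestion.
    positives = seeds + suggestions
    negatives = sample_negatives(exclude=positives, n=2, rng=random.Random(0))
    labeled = [(w, 1) for w in positives] + [(w, -1) for w in negatives]
    model = train_perceptron(labeled)
    print(sorted(w for w in CONTEXTS if classify(model, w)))
```

Note that randomly drawn negatives can include true class members (in this toy data, "melon" may be sampled as a negative), so the result of the demo run depends on the sample. This is a property of the sketch and says nothing about how the actual system handles such cases.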