Relation acquisition using word classes and partial patterns

  • Authors:
  • Stijn De Saeger;Kentaro Torisawa;Masaaki Tsuchida;Jun'ichi Kazama;Chikara Hashimoto;Ichiro Yamada;Jong Hoon Oh;István Varga;Yulan Yan

  • Affiliations:
  • Information Analysis Laboratory, National Institute of Information and Communications Technology, Kyoto, Japan;Information Analysis Laboratory, National Institute of Information and Communications Technology, Kyoto, Japan;Information and Media Processing Laboratories, NEC Corporation, Nara, Japan;Information Analysis Laboratory, National Institute of Information and Communications Technology, Kyoto, Japan;Information Analysis Laboratory, National Institute of Information and Communications Technology, Kyoto, Japan;NHK Science & Technology Research Laboratories, Tokyo, Japan;Information Analysis Laboratory, National Institute of Information and Communications Technology, Kyoto, Japan;Information Analysis Laboratory, National Institute of Information and Communications Technology, Kyoto, Japan;Information Analysis Laboratory, National Institute of Information and Communications Technology, Kyoto, Japan

  • Venue:
  • EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
  • Year:
  • 2011

Abstract

This paper proposes a semi-supervised relation acquisition method that does not rely on extraction patterns (e.g. "X causes Y" for causal relations) but instead learns a combination of two kinds of indirect evidence for the target relation: semantic word classes and partial patterns. This method can extract long-tail instances of semantic relations like causality from rare and complex expressions in a large Japanese Web corpus, including, in extreme cases, patterns that occur only once in the entire corpus. Such patterns are beyond the reach of current pattern-based methods. We show that our method performs on par with state-of-the-art pattern-based methods, and maintains a reasonable level of accuracy even for instances acquired from infrequent patterns. This ability to acquire long-tail instances is crucial for risk management and innovation, where an exhaustive database of high-level semantic relations like causation is of vital importance.
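To make the idea of combining word classes with partial patterns concrete, the following is a minimal sketch and not the authors' implementation. It assumes that a partial pattern is any contiguous word n-gram of the full pattern, that each noun has a single known word class, and that a simple perceptron stands in for whatever learner the paper uses. The function names, the toy English patterns, the class labels, and the seed examples are all hypothetical; the paper itself works on a Japanese Web corpus.

```python
from collections import defaultdict


def partial_patterns(pattern, max_len=3):
    """Return all contiguous token n-grams (length 1..max_len) of a pattern.

    Assumption: this n-gram definition is illustrative only.
    """
    tokens = pattern.split()
    grams = set()
    for n in range(1, max_len + 1):
        for i in range(len(tokens) - n + 1):
            grams.add(" ".join(tokens[i:i + n]))
    return grams


def features(x, y, pattern, word_class):
    """Combine the word-class pair of (x, y) with the partial patterns."""
    cx = word_class.get(x, "UNK")
    cy = word_class.get(y, "UNK")
    feats = {f"class_pair={cx}|{cy}"}
    feats.update(f"partial={g}" for g in partial_patterns(pattern))
    return feats


def train_perceptron(examples, word_class, epochs=10):
    """Train a binary perceptron; labels are +1 (relation holds) or -1."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for x, y, pattern, label in examples:
            feats = features(x, y, pattern, word_class)
            score = sum(weights[f] for f in feats)
            if label * score <= 0:  # misclassified or undecided: update
                for f in feats:
                    weights[f] += label
    return weights


def score(x, y, pattern, word_class, weights):
    """Sum the learned weights of the features of a candidate instance."""
    feats = features(x, y, pattern, word_class)
    return sum(weights.get(f, 0.0) for f in feats)


# Hypothetical toy data: word classes and a few labeled seed instances.
WORD_CLASS = {
    "smoking": "BEHAVIOR", "overeating": "BEHAVIOR",
    "stress": "CONDITION",
    "cancer": "DISEASE", "insomnia": "DISEASE", "obesity": "DISEASE",
    "Tokyo": "PLACE", "Kyoto": "PLACE", "Osaka": "PLACE", "Japan": "PLACE",
}
SEEDS = [
    ("smoking", "cancer", "X causes Y", +1),
    ("stress", "insomnia", "X leads to Y", +1),
    ("Tokyo", "Japan", "X is located in Y", -1),
    ("Kyoto", "Japan", "X is a city in Y", -1),
]

weights = train_perceptron(SEEDS, WORD_CLASS)

# A long pattern never seen in training still shares partial patterns
# ("leads to", "to Y") and a word-class pair with the positive seeds.
rare = score("overeating", "obesity",
             "X, which doctors warn eventually leads to Y",
             WORD_CLASS, weights)
other = score("Osaka", "Japan", "X is a port town in Y", WORD_CLASS, weights)
print(rare > 0, other < 0)
```

In this toy setting the unseen long pattern is scored as causal even though it never occurred as a whole in the seeds, because the evidence comes from its fragments and from the class pair rather than from the full pattern. That is the intuition behind acquiring instances from patterns that appear only once; the actual features, classifier, and scoring in the paper may differ.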