Entity set expansion using topic information

  • Authors:
  • Kugatsu Sadamitsu;Kuniko Saito;Kenji Imamura;Genichiro Kikui

  • Affiliations:
  • NTT Cyber Space Laboratories, NTT Corporation, Hikarinooka, Yokosuka-shi, Kanagawa, Japan (all four authors)

  • Venue:
  • HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: short papers - Volume 2
  • Year:
  • 2011

Abstract

This paper proposes three modules, based on the latent topics of documents, that alleviate "semantic drift" in bootstrapping-based entity set expansion. Added to a discriminative bootstrapping algorithm, the modules perform topic feature generation, negative example selection, and entity candidate pruning. Latent topics are modeled in an unsupervised way with LDA (Latent Dirichlet Allocation). Experiments show that the accuracy of the extracted entities improves by 6.7 to 28.2%, depending on the domain.
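
The abstract names the three topic-based modules but does not give their formulas, so the following is only a minimal sketch of how such modules might look, not the authors' implementation. It assumes that per-document topic distributions are already available (for example, from a previously trained LDA model). The toy data, the function names, the dot-product similarity, and the threshold are all hypothetical choices made for illustration.

```python
# Hypothetical per-document topic distributions, e.g. the output of an
# LDA model trained beforehand (toy values, not data from the paper).
DOC_TOPICS = {
    "d1": [0.8, 0.1, 0.1],
    "d2": [0.7, 0.2, 0.1],
    "d3": [0.1, 0.8, 0.1],
    "d4": [0.1, 0.1, 0.8],
}

# Hypothetical mapping from each entity string to the documents it occurs in.
ENTITY_DOCS = {
    "Toyota": ["d1", "d2"],
    "Honda": ["d1"],
    "Nissan": ["d2"],
    "Mozart": ["d3"],
    "Everest": ["d4"],
}


def entity_topics(entity):
    """Topic features: average the topic vectors of the entity's documents."""
    docs = ENTITY_DOCS[entity]
    n_topics = len(DOC_TOPICS[docs[0]])
    return [sum(DOC_TOPICS[d][k] for d in docs) / len(docs)
            for k in range(n_topics)]


def seed_topic_profile(seeds):
    """Average topic vector over all seed entities."""
    vectors = [entity_topics(s) for s in seeds]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def topic_score(entity, profile):
    """Assumed similarity: dot product of entity topics and seed profile."""
    return sum(a * b for a, b in zip(entity_topics(entity), profile))


def select_negatives(seeds, profile, k):
    """Negative examples: the k non-seed entities least similar to the seeds."""
    others = [e for e in ENTITY_DOCS if e not in seeds]
    return sorted(others, key=lambda e: topic_score(e, profile))[:k]


def prune_candidates(candidates, profile, threshold):
    """Candidate pruning: drop candidates below the similarity threshold."""
    return [c for c in candidates if topic_score(c, profile) >= threshold]


if __name__ == "__main__":
    seeds = ["Toyota", "Honda"]
    profile = seed_topic_profile(seeds)
    print(select_negatives(seeds, profile, k=2))
    print(prune_candidates(["Nissan", "Mozart", "Everest"], profile, 0.5))
```

In this toy setting, entities that appear in documents whose topics are far from those of the seeds become negative examples, and candidates with low topic similarity are discarded before they can pull the bootstrapping loop toward an unrelated category, which is the kind of "semantic drift" the abstract describes.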