Two stages based organization name disambiguity

  • Authors:
  • Shu Zhang;Jianwei Wu;Dequan Zheng;Yao Meng;Yingju Xia;Hao Yu

  • Affiliations:
  • Fujitsu Research and Development Center, Beijing, China;School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China;School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China;Fujitsu Research and Development Center, Beijing, China;Fujitsu Research and Development Center, Beijing, China;Fujitsu Research and Development Center, Beijing, China

  • Venue:
  • CICLing'12 Proceedings of the 13th international conference on Computational Linguistics and Intelligent Text Processing - Volume Part I
  • Year:
  • 2012

Abstract

With the rapid growth of user-generated media, Twitter has become an important information resource where users share fresh information on any subject. To address the problem of finding tweets related to a given organization, we propose a two-stage approach to organization name disambiguation. Insufficient information and the diversity of organizations are the two key problems for this task. To address insufficient information, we derive multiple types of features that enrich the information available about an organization. To address the diversity of organizations, we mine the relationships between tweets and the organization, and the relationships among tweets, in two stages. Furthermore, we examine the distribution of ambiguity across organization names and its influence on different classifiers. Our experimental results on WePS-3 show that the proposed methods are effective and promising for this task.