Web personal name disambiguation based on reference entity tables mined from the web

  • Authors:
  • Xianpei Han; Jun Zhao

  • Affiliations:
  • Chinese Academy of Sciences, Beijing, China; Chinese Academy of Sciences, Beijing, China

  • Venue:
  • Proceedings of the eleventh international workshop on Web information and data management
  • Year:
  • 2009

Abstract

Ambiguous personal names are common on the Web and pose a challenge for many tasks. Traditional disambiguation relies on clustering methods. Without reference entity tables, however, a clustering method can only determine whether two names refer to the same entity; it cannot identify which entities they refer to. Furthermore, clustering methods struggle to achieve robust performance across different names. Some recent disambiguation methods (the link-with-entity-base methods) extract reference entity tables from online entity bases. These methods, however, suffer from the limited coverage of the entity bases, so they can disambiguate only the names that those bases cover. In this paper, to overcome the deficiencies of previous methods, we propose a web-querying method that automatically mines reference entity tables from the Web with the help of professional category knowledge. We then disambiguate personal names by linking them, through categorization, to the personal entities in the mined tables. Experimental results on a dataset extracted from Freebase show that our web-querying method can effectively mine personal entities, with an F-measure of 0.90. Disambiguation results on the WePS datasets show that our method achieves more robust and informative performance than the traditional clustering methods, and outperforms the traditional link-with-entity-base methods by 0.29 in F-measure.
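
To illustrate the difference between clustering and linking against a reference entity table, the following is a minimal sketch, not the authors' implementation. It assumes a small hand-written table (the paper mines its tables from the Web) and substitutes a plain bag-of-words cosine similarity for the paper's categorization step. The names `link_mention`, `reference_table`, and the entity identifiers are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count whitespace-separated tokens."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def link_mention(context, reference_table):
    """Return the id of the table entity most similar to the context.

    Returns None when no entity shares any token with the context.
    Assumption: similarity-based matching stands in for the paper's
    categorization step, which the abstract does not specify.
    """
    ctx = bag_of_words(context)
    best_id, best_score = None, 0.0
    for entity_id, description in reference_table.items():
        score = cosine(ctx, bag_of_words(description))
        if score > best_score:
            best_id, best_score = entity_id, score
    return best_id


# Hypothetical reference entity table for one ambiguous name.
reference_table = {
    "michael_jordan_basketball": "basketball player nba chicago bulls",
    "michael_jordan_professor": "professor machine learning statistics berkeley",
}

print(link_mention("He won six NBA titles with the Chicago Bulls", reference_table))
print(link_mention("A lecture on machine learning at Berkeley", reference_table))
```

Unlike a clustering result, which only groups mentions together, each call here returns an identifier for a specific entity in the table. That is the "which entity" answer the abstract argues clustering cannot provide without a reference table.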