A cascaded classification approach to disambiguating polysemous mentions with social chains

  • Authors:
  • Yu-Chuan Wei;Ming-Shun Lin;Hsin-Hsi Chen

  • Affiliations:
  • Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan;Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan;Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan

  • Venue:
  • Expert Systems with Applications: An International Journal
  • Year:
  • 2010

Quantified Score

Hi-index 12.05

Abstract

This paper considers five features for personal name disambiguation: titles, community chains, terms, temporal expressions, and hostnames. Using nine test data sets covering three ambiguous personal names, we address the issues of the awareness degree of an entity, the source of materials, and web pages in different areas. In a single-clusterer approach, employing all features achieves an average F-score of 0.635, which is better than the 0.502 achieved by employing contextual terms only. When community chains are expanded by using the web, the average F-score increases to 0.676. We also propose a multiple-clusterer approach, which cascades five clusterers corresponding to the five features. The average F-score is further improved to 0.684. Expanding community chains also enhances the average F-score of the multiple-clusterer approach to 0.697. In summary, the proposed features are quite useful; the cascaded multiple-clusterer approach is better than the single-clusterer approach; and expanding community chains using the web has positive effects on personal name disambiguation. The experiments show that this approach yields significant improvements.
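
The abstract does not say how the five clusterers are chained, so the following is only a minimal sketch of one possible cascade, not the authors' method. It assumes each web page is represented as a mapping from feature name to a set of extracted values, that the stages run in the order the features are listed in the abstract, and that a stage merges two clusters whenever they share at least one value of its feature. The names `FEATURES`, `merge_stage`, and `cascade` are hypothetical.

```python
from typing import Dict, List, Set

# Stage order is an assumption; the abstract only names the five features.
FEATURES = ["titles", "community_chains", "terms",
            "temporal_expressions", "hostnames"]

# One web page: feature name -> set of values extracted from that page.
Page = Dict[str, Set[str]]


def merge_stage(clusters: List[Set[int]], pages: List[Page],
                feature: str) -> List[Set[int]]:
    """Merge clusters that share at least one value of `feature`.

    The shared-value merge rule is a hypothetical stand-in for a real
    clusterer; it is implemented with a small union-find over clusters.
    """
    parent = list(range(len(clusters)))

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    owner: Dict[str, int] = {}  # feature value -> first cluster seen with it
    for idx, cluster in enumerate(clusters):
        for page_id in cluster:
            for value in pages[page_id].get(feature, set()):
                if value in owner:
                    parent[find(idx)] = find(owner[value])
                else:
                    owner[value] = idx

    groups: Dict[int, Set[int]] = {}
    for idx, cluster in enumerate(clusters):
        groups.setdefault(find(idx), set()).update(cluster)
    return list(groups.values())


def cascade(pages: List[Page],
            features: List[str] = FEATURES) -> List[Set[int]]:
    """Start with one cluster per page and refine it stage by stage."""
    clusters: List[Set[int]] = [{i} for i in range(len(pages))]
    for feature in features:
        clusters = merge_stage(clusters, pages, feature)
    return clusters


# Toy input (invented for illustration): four pages mentioning one name.
pages: List[Page] = [
    {"titles": {"professor"}, "hostnames": {"ntu.edu.tw"}},
    {"titles": {"professor"}, "terms": {"nlp"}},
    {"terms": {"basketball"}, "hostnames": {"nba.com"}},
    {"hostnames": {"nba.com"}},
]
print(sorted(sorted(c) for c in cascade(pages)))
```

In this toy run the titles stage groups pages 0 and 1, and the hostnames stage later groups pages 2 and 3, which illustrates how a later clusterer can merge what an earlier one left apart. A real system would likely replace the shared-value rule with a similarity threshold per feature.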