Discovering overlapping communities of named entities

  • Authors:
  • Xin Li; Bing Liu; Philip S. Yu

  • Affiliations:
  • Department of Computer Science, University of Illinois at Chicago, Chicago, IL; Department of Computer Science, University of Illinois at Chicago, Chicago, IL; IBM T.J. Watson Research Center, Hawthorne, NY

  • Venue:
  • PKDD'06 Proceedings of the 10th European Conference on Principles and Practice of Knowledge Discovery in Databases
  • Year:
  • 2006

Abstract

Although community discovery based on social network analysis has been studied extensively in the Web hyperlink environment, little research has addressed named entities in text documents. The co-occurrence of entities in documents usually implies some connection among them, and investigating such connections can reveal important patterns. In this paper, we mine communities among named entities in Web documents and text corpora. Most existing work on community discovery generates a partition of the entity network, assuming that each entity belongs to exactly one community. However, a named entity may participate in several communities. For example, a person belongs to the communities of his or her family, colleagues, and friends. We therefore propose a novel technique to mine overlapping communities of named entities. The technique is based on triangle formation, expansion, and clustering with content similarity. Our experimental results show that the proposed technique is highly effective.
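
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea behind the first step, not the authors' method. It assumes each document has already been reduced to a list of named entities, links entities that co-occur in a document, and enumerates triangles as candidate community seeds. The function names, the toy documents, and the unweighted-edge choice are all hypothetical; the expansion and content-similarity clustering steps are omitted.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_graph(docs):
    """Link two entities whenever they appear in the same document.

    Assumption: `docs` is a list of entity lists (hypothetical input format).
    """
    adj = defaultdict(set)
    for entities in docs:
        for a, b in combinations(sorted(set(entities)), 2):
            adj[a].add(b)
            adj[b].add(a)
    return dict(adj)


def triangles(adj):
    """Return every triangle once, as a sorted 3-tuple of entity names."""
    found = set()
    for a in adj:
        for b, c in combinations(sorted(adj[a]), 2):
            if c in adj[b]:
                found.add(tuple(sorted((a, b, c))))
    return sorted(found)


# Hypothetical toy corpus: one "family" document and one "colleagues" document.
docs = [
    ["Alice", "Bob", "Carol"],
    ["Alice", "Dave", "Erin"],
]

adj = cooccurrence_graph(docs)
seeds = triangles(adj)

# Map each entity to the indices of the seed triangles it belongs to.
membership = defaultdict(list)
for i, tri in enumerate(seeds):
    for entity in tri:
        membership[entity].append(i)
```

In this toy corpus, `Alice` appears in both seed triangles while every other entity appears in only one, which is the kind of overlap a strict partition of the network cannot represent.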