Although community discovery has been studied extensively in the Web environment, little research has addressed free text. Co-occurrence of words and entities in sentences and documents usually implies connections among them. In this paper, we investigate the co-occurrences of named entities in text and mine communities among these entities. We show that identifying communities in free text can be transformed into a graph clustering problem, and we propose a hierarchical clustering algorithm to solve it. Our experiment shows that the algorithm is effective at discovering named entity communities in text documents.
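
To make the transformation to graph clustering concrete, the following is a minimal sketch and not the algorithm proposed in the paper. It assumes the named entities have already been extracted for each sentence, weights each entity pair by the number of sentences they share, and merges clusters bottom-up with average linkage. The function names, the `min_strength` stopping threshold, and the toy data are all hypothetical choices made for illustration.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_graph(sentences):
    """Count how often each unordered pair of entities shares a sentence."""
    weights = Counter()
    for entities in sentences:
        # sorted() gives every pair a canonical (smaller, larger) key
        for a, b in combinations(sorted(set(entities)), 2):
            weights[(a, b)] += 1
    return weights


def link_strength(c1, c2, weights):
    """Average co-occurrence weight between two clusters (average linkage)."""
    total = sum(weights.get((min(a, b), max(a, b)), 0) for a in c1 for b in c2)
    return total / (len(c1) * len(c2))


def agglomerate(sentences, min_strength=1.0):
    """Merge the most strongly linked clusters until none reaches min_strength.

    min_strength is an assumed stopping rule, not one taken from the paper.
    Returns the final clusters and the merge history (the hierarchy).
    """
    weights = cooccurrence_graph(sentences)
    clusters = [frozenset([e]) for e in sorted({e for s in sentences for e in s})]
    merges = []
    while len(clusters) > 1:
        best = max(
            combinations(clusters, 2),
            key=lambda pair: link_strength(pair[0], pair[1], weights),
        )
        strength = link_strength(best[0], best[1], weights)
        if strength < min_strength:
            break
        merged = best[0] | best[1]
        clusters = [c for c in clusters if c not in best] + [merged]
        merges.append((merged, strength))
    return clusters, merges


# Toy input: one list of (already extracted) named entities per sentence.
sentences = [
    ["Alice", "Bob"],
    ["Alice", "Bob", "Carol"],
    ["Dave", "Erin"],
    ["Dave", "Erin"],
    ["Carol", "Dave"],
]
communities, history = agglomerate(sentences)
for community in communities:
    print(sorted(community))
```

On this toy input the sketch ends with two communities, one containing Alice, Bob, and Carol and one containing Dave and Erin. The single sentence linking Carol and Dave is too weak, once averaged over all cross-cluster pairs, to merge the two groups. The `history` list records each merge in order and can be read as the cluster hierarchy.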