This paper addresses multi-document personal name disambiguation with an agglomerative clustering approach. We start from an analysis of the point-wise mutual information between each feature and the ambiguous name, which leads to a novel method for computing feature weights in clustering. We then propose a measure that trades off within-cluster compactness against between-cluster separation and use it to decide when to stop clustering. Next, we apply a labeling method to find representative features for each cluster. Finally, experiments on word-based clustering over a Chinese dataset show that the approach performs well.
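To make the first two steps more concrete, the following is a minimal sketch, not the paper's actual formulation. It assumes that point-wise mutual information is estimated from document counts as log(P(f, n) / (P(f) P(n))), that negative values are floored at zero, and that the stopping measure is simply mean within-cluster similarity minus mean between-cluster similarity. The function names `pmi_weights` and `tradeoff_score` are hypothetical.

```python
import math
from itertools import combinations


def pmi_weights(docs, name):
    """Weight each feature by its PMI with the ambiguous name.

    docs: list of sets of tokens, one set per document.
    Probabilities are estimated from document counts (an assumption).
    Negative or undefined PMI is floored at 0.0 (also an assumption).
    """
    total = len(docs)
    name_docs = [d for d in docs if name in d]
    features = set().union(*docs) - {name}
    weights = {}
    for f in features:
        f_count = sum(1 for d in docs if f in d)
        co_count = sum(1 for d in name_docs if f in d)
        if co_count == 0:
            weights[f] = 0.0
            continue
        # PMI = log( P(f, n) / (P(f) * P(n)) ), written in terms of counts
        value = math.log((co_count * total) / (f_count * len(name_docs)))
        weights[f] = max(0.0, value)
    return weights


def tradeoff_score(clusters, sim):
    """Mean within-cluster similarity minus mean between-cluster similarity.

    clusters: list of lists of items; sim: function (a, b) -> float.
    A higher score means tighter, better separated clusters. An agglomerative
    procedure could stop merging at the step where this score peaks.
    """
    within = [sim(a, b) for c in clusters for a, b in combinations(c, 2)]
    between = [
        sim(a, b)
        for c1, c2 in combinations(clusters, 2)
        for a in c1
        for b in c2
    ]

    def mean(xs):
        return sum(xs) / len(xs) if xs else 0.0

    return mean(within) - mean(between)
```

Under these assumptions, a feature that co-occurs with the name more often than chance receives a positive weight, and features that are independent of the name, or negatively associated with it, are effectively ignored during clustering.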