Personal name disambiguation is an important task in social network extraction, evaluation and integration of ontologies, information retrieval, cross-document coreference resolution and word sense disambiguation. We propose an unsupervised method to automatically annotate people with ambiguous names on the Web using automatically extracted keywords. Given an ambiguous personal name, first, we download text snippets for the given name from a Web search engine. We then represent each instance of the ambiguous name by a term-entity model (TEM), a model that we propose to represent the Web appearance of an individual. A TEM of a person captures named entities and attribute values that are useful to disambiguate that person from his or her namesakes (i.e., different people who share the same name). We then use group average agglomerative clustering to identify the instances of an ambiguous name that belong to the same person. Ideally, each cluster must represent a different namesake. However, in practice it is not possible to know the number of namesakes for a given ambiguous personal name in advance. To circumvent this problem, we propose a novel normalized cuts-based cluster stopping criterion to determine the different people on the Web for a given ambiguous name. Finally, we annotate each person with an ambiguous name using keywords selected from the clusters. We evaluate the proposed method on a data set of over 2500 documents covering 200 different people for 20 ambiguous names. Experimental results show that the proposed method outperforms numerous baselines and previously proposed name disambiguation methods. Moreover, the extracted keywords reduce ambiguity of a name in an information retrieval task, which underscores the usefulness of the proposed method in real-world scenarios. © 2012 Wiley Periodicals, Inc.
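The clustering step described in the abstract can be illustrated with a minimal sketch. It makes several assumptions that are not taken from the paper: each snippet is reduced to a plain bag-of-words vector as a stand-in for the term-entity model, similarity is cosine, "group average" is read as the mean pairwise similarity between the members of two clusters, and merging stops at a fixed similarity threshold rather than at the normalized cuts-based criterion the authors propose. The names `to_vector`, `cluster`, and `threshold`, and the example snippets, are hypothetical.

```python
import math
from collections import Counter


def to_vector(snippet, name):
    """Bag-of-words stand-in for a TEM (assumption): drop the name tokens."""
    name_tokens = set(name.lower().split())
    return Counter(t for t in snippet.lower().split() if t not in name_tokens)


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def group_average(c1, c2, vectors):
    """Mean pairwise similarity between members of two clusters."""
    total = sum(cosine(vectors[i], vectors[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def cluster(vectors, threshold=0.2):
    """Agglomerative clustering with a fixed-threshold stop (assumption,
    not the paper's normalized cuts-based stopping criterion)."""
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > 1:
        best, pair = -1.0, None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                sim = group_average(clusters[x], clusters[y], vectors)
                if sim > best:
                    best, pair = sim, (x, y)
        if best < threshold:
            break  # remaining clusters are too dissimilar to merge
        x, y = pair
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


# Hypothetical snippets for one ambiguous name.
name = "Jim Clark"
snippets = [
    "Jim Clark netscape founder silicon graphics",
    "Jim Clark founder netscape entrepreneur",
    "Jim Clark formula one racing driver",
    "Jim Clark racing driver lotus champion",
]
vectors = [to_vector(s, name) for s in snippets]
clusters = cluster(vectors)
print(clusters)
```

With these toy snippets the two technology-related snippets end up in one cluster and the two racing-related snippets in another. The fixed threshold is only a placeholder: choosing the number of clusters automatically is exactly the problem the paper's stopping criterion is meant to address.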