Person name disambiguation in web pages using social network, compound words and latent topics

Authors:
Shingo Ono;Issei Sato;Minoru Yoshida;Hiroshi Nakagawa
Affiliations:
Graduate School of Information Science and Technology, The University of Tokyo;Graduate School of Information Science and Technology, The University of Tokyo;Information Technology Center, The University of Tokyo;Information Technology Center, The University of Tokyo
Venue:
PAKDD'08 Proceedings of the 12th Pacific-Asia conference on Advances in knowledge discovery and data mining
Year:
2008

Abstract

The World Wide Web (WWW) provides much information about persons, and in recent years WWW search engines have been commonly used for learning about persons. However, many persons share the same name, and this ambiguity typically causes the search results for one person name to include Web pages about several different persons. We propose a novel framework for person name disambiguation that has the following three component processes: (1) extraction of social network information by finding co-occurrences of named entities; (2) measurement of document similarities based on occurrences of key compound words; and (3) inference of topic information from documents based on the Dirichlet process unigram mixture model. Experiments using an actual Web document dataset show that the results of our framework are promising.
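
To make the first two component processes concrete, the following is a minimal sketch and not the authors' implementation. It assumes that named entities and key compound words have already been extracted for each page, and it groups pages that share enough of either. The overlap coefficient, the two thresholds, the single-link merging, and the names `overlap`, `cluster_pages`, `entities`, and `compounds` are all illustrative assumptions. The topic-inference component (the Dirichlet process unigram mixture) is omitted.

```python
from itertools import combinations


def overlap(a, b):
    """Overlap coefficient between two sets (0.0 when either is empty)."""
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def cluster_pages(pages, entity_threshold=0.5, compound_threshold=0.5):
    """Group pages that share enough named entities or compound words.

    pages: list of dicts with the hypothetical keys "entities" and
    "compounds", each a set of strings assumed to be extracted upstream.
    Returns a sorted list of clusters, each a list of page indices.
    """
    parent = list(range(len(pages)))  # union-find forest over page indices

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(pages)), 2):
        shared_entities = overlap(pages[i]["entities"], pages[j]["entities"])
        shared_compounds = overlap(pages[i]["compounds"], pages[j]["compounds"])
        # Assumed rule: either kind of evidence alone is enough to link pages.
        if (shared_entities >= entity_threshold
                or shared_compounds >= compound_threshold):
            parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(pages)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())
```

Each returned cluster is meant to stand for one real-world person sharing the queried name. In practice the thresholds would need tuning on labeled data, and a linking rule this permissive can merge distinct persons through a chain of weak links.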