The World Wide Web (WWW) provides a great deal of information about people, and in recent years Web search engines have become a common way to learn about them. However, many people share the same name, so the search results for a single person name typically include Web pages about several different people. We propose a novel framework for person name disambiguation that consists of three component processes: (1) extraction of social network information by finding co-occurrences of named entities; (2) measurement of document similarities based on occurrences of key compound words; and (3) inference of topic information from documents based on the Dirichlet process unigram mixture model. Experiments on a dataset of actual Web documents show that the framework yields promising results.
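To make the first two components more concrete, here is a minimal sketch of how pages returned for one ambiguous name might be grouped by shared features. It is not the method of the paper: the abstract does not specify its similarity measure or clustering procedure, so the overlap coefficient, the single-linkage grouping, the `threshold` parameter, the function names, and the toy data are all assumptions chosen for illustration. The Dirichlet process unigram mixture model of the third component is not shown.

```python
from itertools import combinations


def overlap(a: set, b: set) -> float:
    """Overlap coefficient |a & b| / min(|a|, |b|); 0.0 if either set is empty."""
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def cluster_pages(pages: dict, threshold: float = 0.5) -> list:
    """Group pages whose feature sets overlap at or above `threshold`.

    Uses single-linkage grouping via union-find (an assumed stand-in,
    not the clustering procedure of the paper).
    """
    parent = {name: name for name in pages}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for p, q in combinations(pages, 2):
        if overlap(pages[p], pages[q]) >= threshold:
            parent[find(p)] = find(q)

    groups = {}
    for name in pages:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical toy data: each page is reduced to a set of co-occurring
# named entities and key compound words (all values are made up).
pages = {
    "page_a": {"Acme Corp", "Springfield", "software engineer"},
    "page_b": {"Acme Corp", "Springfield", "open source"},
    "page_c": {"Riverside FC", "goal keeper", "league title"},
}
clusters = cluster_pages(pages)
print(clusters)
```

In this toy run, `page_a` and `page_b` share two of three features and fall into one group, while `page_c` shares none and stays separate. A fixed threshold requires tuning per dataset, which is one reason a nonparametric model such as a Dirichlet process mixture, where the number of clusters is not fixed in advance, is attractive for this task.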