Improving the performance of personal name disambiguation using web directories

Authors:
Quang Minh Vu;Atsuhiro Takasu;Jun Adachi
Affiliations:
The University of Tokyo, Graduate School of Information Science and Technology, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan and National Institute of Informatics, 2-1-2 Hitotsubashi, Chiyoda-ku, ...;National Institute of Informatics, 2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo 101-8430, Japan;National Institute of Informatics, 2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo 101-8430, Japan
Venue:
Information Processing and Management: an International Journal
Year:
2008

Abstract

Users frequently ask web search engines for information about people by personal name. Current search engines return only the set of documents that contain the queried name, but because several people usually share a personal name, the result set often mixes documents about different people. Disambiguating the people in these result sets is necessary to help users find the person of interest more readily. In name disambiguation, effectively measuring the similarity between documents is a crucial step towards the final disambiguation. We propose a new method that uses web directories as a knowledge base to find common contexts in documents and then uses a measure of these common contexts to determine document similarities. Experiments conducted on documents mentioning real people on the web, together with several well-known web directory structures, suggest that using web directories to disambiguate people has significant advantages over conventional methods.
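The idea of measuring document similarity through contexts drawn from a web directory can be illustrated with a minimal Python sketch. This is not the authors' algorithm, which the abstract does not specify. The toy `DIRECTORY`, the term lists, the helper names, and the use of cosine similarity over per-category term counts are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical toy "web directory": category -> terms typical of that category.
# A real directory would be far larger; these entries are illustrative only.
DIRECTORY = {
    "Science/Computer_Science": {"algorithm", "database", "retrieval", "software"},
    "Sports/Baseball": {"pitcher", "inning", "league", "batting"},
    "Arts/Music": {"album", "guitar", "concert", "band"},
}


def tokenize(text):
    """Lowercase the text and strip surrounding punctuation from each word."""
    return [w.strip(".,;:!?()").lower() for w in text.split()]


def context_vector(text):
    """Count, per directory category, how many document tokens match its terms."""
    counts = Counter(tokenize(text))
    return {cat: sum(counts[t] for t in terms) for cat, terms in DIRECTORY.items()}


def cosine(u, v):
    """Cosine similarity of two vectors that share the same keys."""
    dot = sum(u[k] * v[k] for k in u)
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(
        sum(x * x for x in v.values())
    )
    return dot / norm if norm else 0.0


def similarity(doc_a, doc_b):
    """Similarity of two documents based on their shared directory contexts."""
    return cosine(context_vector(doc_a), context_vector(doc_b))


# Three documents mentioning the same (hypothetical) name.
doc_a = "John Smith published a new retrieval algorithm."
doc_b = "John Smith maintains open source database software."
doc_c = "John Smith, a pitcher, led the league in the ninth inning."

print(similarity(doc_a, doc_b))  # both fall under the computer science category
print(similarity(doc_a, doc_c))  # different categories
```

In this sketch `doc_a` and `doc_b` share no words apart from the name itself, yet they map to the same directory category, which is the kind of common context a plain word-overlap measure would miss.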