We study the problem of disambiguating the results of a web people search engine: given a query consisting of a person name, together with the result pages for that query, find the correct referent for each mention by clustering the pages according to the different people who share the name. Although the problem has been studied extensively, we find that the growing share of results retrieved from social media platforms causes state-of-the-art methods to break down. We analyze the problem and propose a dual strategy that distinguishes between results obtained from social media platforms and those obtained from other sources. The two types of documents are disambiguated separately, using different methods, and the results are then merged. We study several instantiations of each stage of the proposed strategy and achieve state-of-the-art performance.
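The split, cluster, and merge pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the host list `SOCIAL_HOSTS`, the greedy token-overlap clustering, the thresholds, and the merge rule are all assumptions chosen to keep the example short, and the abstract does not say which clustering or merging methods the paper actually uses.

```python
from urllib.parse import urlparse

# Hypothetical list of social media hosts; not taken from the paper.
SOCIAL_HOSTS = {"facebook.com", "twitter.com", "linkedin.com", "myspace.com"}


def is_social(url):
    """True if the URL's host is a listed social host or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == h or host.endswith("." + h) for h in SOCIAL_HOSTS)


def jaccard(a, b):
    """Jaccard overlap between two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster(docs, threshold):
    """Greedy threshold clustering (an assumed stand-in method): a document
    joins the first cluster holding a member with enough token overlap."""
    clusters = []
    for doc in docs:
        tokens = set(doc["text"].lower().split())
        for c in clusters:
            if any(jaccard(tokens, t) >= threshold for t in c["tokens"]):
                c["ids"].append(doc["id"])
                c["tokens"].append(tokens)
                break
        else:
            clusters.append({"ids": [doc["id"]], "tokens": [tokens]})
    return clusters


def disambiguate(docs, web_threshold=0.3, social_threshold=0.2,
                 merge_threshold=0.3):
    """Dual strategy: split by source, cluster each part, then merge."""
    social = [d for d in docs if is_social(d["url"])]
    web = [d for d in docs if not is_social(d["url"])]

    # Stage 1: disambiguate the two document types separately.
    web_clusters = cluster(web, web_threshold)
    social_clusters = cluster(social, social_threshold)

    # Stage 2 (assumed merge rule): attach each social cluster to the most
    # similar web cluster, or keep it as its own person if none is close.
    merged = [{"ids": list(c["ids"]), "vocab": set().union(*c["tokens"])}
              for c in web_clusters]
    leftovers = []
    for sc in social_clusters:
        vocab = set().union(*sc["tokens"])
        best = max(merged, key=lambda m: jaccard(vocab, m["vocab"]),
                   default=None)
        if best is not None and jaccard(vocab, best["vocab"]) >= merge_threshold:
            best["ids"].extend(sc["ids"])
            best["vocab"] |= vocab
        else:
            leftovers.append(sorted(sc["ids"]))
    return [sorted(m["ids"]) for m in merged] + leftovers


if __name__ == "__main__":
    pages = [
        {"id": 1, "url": "http://example.org/a",
         "text": "john smith professor physics university"},
        {"id": 2, "url": "http://example.org/b",
         "text": "john smith physics professor university lab"},
        {"id": 3, "url": "http://example.com/c",
         "text": "john smith guitarist band tour"},
        {"id": 4, "url": "https://www.facebook.com/js",
         "text": "john smith physics professor university"},
        {"id": 5, "url": "https://twitter.com/js2",
         "text": "chef restaurant paris"},
    ]
    print(disambiguate(pages))
```

Using separate thresholds for the two document types reflects the abstract's point that social media results behave differently from other pages; the specific values here are placeholders. A realistic system would also discard the query name tokens before comparing pages, since every result page shares them.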