Information Sciences: an International Journal
Person name queries often return web pages that refer to different individuals sharing the same name. The Web People Search (WePS) task consists of organizing the search results for an ambiguous person name query into meaningful clusters, with each cluster referring to one individual. This paper presents a fuzzy ant-based clustering approach to this multi-document person name disambiguation problem. The main advantage of fuzzy ant-based clustering, a technique inspired by the way ants gather dead nestmates into piles, is that the number of output clusters need not be specified. This makes the algorithm well suited to the WePS task, where the number of individuals a person name refers to is not known in advance. We compare our approach with state-of-the-art partitional and hierarchical clustering approaches (k-means and Agnes) and obtain favorable results. This is particularly interesting because those approaches require either manually setting a similarity threshold or estimating the number of clusters in advance, whereas the fuzzy ant-based clustering algorithm does not.
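To make the ant metaphor concrete, the following minimal sketch shows the pick-up and drop probabilities of the classical (non-fuzzy) ant clustering model from the ant-sorting literature. It is not the fuzzy rule base used in this paper; the function names and the constants `k1` and `k2` are illustrative assumptions. Here `f` stands for the local density of similar items around an ant, scaled to [0, 1].

```python
def pick_probability(f: float, k1: float = 0.1) -> float:
    """Chance that an unladen ant picks up an item.

    f  -- fraction of similar items in the ant's neighbourhood (0..1)
    k1 -- assumed threshold constant; smaller values make ants pickier
    """
    return (k1 / (k1 + f)) ** 2


def drop_probability(f: float, k2: float = 0.3) -> float:
    """Chance that a laden ant drops its item at the current location.

    f  -- fraction of similar items in the ant's neighbourhood (0..1)
    k2 -- assumed threshold constant; smaller values make drops more likely
    """
    return (f / (k2 + f)) ** 2


if __name__ == "__main__":
    # Isolated items are likely to be picked up; items are likely to be
    # dropped next to similar ones, so piles grow without a preset count.
    for density in (0.0, 0.1, 0.5, 0.9):
        print(density, pick_probability(density), drop_probability(density))
```

Because piles emerge from these local decisions, nothing in the sketch fixes the number of clusters in advance, which is the property the abstract highlights.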