Term-weighting approaches in automatic text retrieval
Information Processing and Management: an International Journal
Two supervised learning approaches for name disambiguation in author citations
Proceedings of the 4th ACM/IEEE-CS joint conference on Digital libraries
Name disambiguation in author citations using a K-way spectral clustering method
Proceedings of the 5th ACM/IEEE-CS joint conference on Digital libraries
A hierarchical naive Bayes mixture model for name disambiguation in author citations
Proceedings of the 2005 ACM symposium on Applied computing
Collective entity resolution in relational data
ACM Transactions on Knowledge Discovery from Data (TKDD)
Efficient topic-based unsupervised name disambiguation
Proceedings of the 7th ACM/IEEE-CS joint conference on Digital libraries
Author Name Disambiguation for Citations Using Topic and Web Correlation
ECDL '08 Proceedings of the 12th European conference on Research and Advanced Technology for Digital Libraries
On co-authorship for author disambiguation
Information Processing and Management: an International Journal
Improving author coreference by resource-bounded information gathering from the web
IJCAI'07 Proceedings of the 20th international joint conference on Artifical intelligence
The WEKA data mining software: an update
ACM SIGKDD Explorations Newsletter
Efficient name disambiguation for large-scale databases
PKDD'06 Proceedings of the 10th European conference on Principle and Practice of Knowledge Discovery in Databases
Cost-effective on-demand associative author name disambiguation
Information Processing and Management: an International Journal
Citation-based bootstrapping for large-scale author disambiguation
Journal of the American Society for Information Science and Technology
Hi-index | 0.00 |
The name ambiguity problem is especially challenging in the field of bibliographic digital libraries. The problem is amplified when names are collected from heterogeneous sources. This is the case in the Scholarometer system, which performs bibliometric analysis by cross-correlating author names in user queries with those retrieved from digital libraries. The uncontrolled nature of user-generated annotations is very valuable, but creates the need to detect ambiguous names. Our goal is to detect ambiguous names at query time by mining digital library annotation data, thereby decreasing noise in the bibliometric analysis. We explore three kinds of heuristic features based on citations, metadata, and crowdsourced topics in a supervised learning framework. The proposed approach achieves almost 80% accuracy. Finally, we compare the performance of ambiguous author detection in Scholarometer using Google Scholar against a baseline based on Microsoft Academic Search.