Author name disambiguation must cope with the uncertainty that arises from possible many-to-many correspondences between ambiguous names and unique authors. Although the literature offers a variety of name disambiguation methods, they are rarely compared against each other. Moreover, they are often evaluated without considering a time-evolving digital library, which is subject to dynamic (and therefore challenging) patterns such as the introduction of new authors and changes in researchers' interests over time. To facilitate the evaluation of name disambiguation methods in various realistic scenarios and under controlled conditions, in this article we propose SyGAR, a new Synthetic Generator of Authorship Records that generates citation records based on author profiles. SyGAR can generate successive loads of citation records, simulating a living digital library that evolves according to various publication patterns. We validate SyGAR by comparing the results produced by three representative name disambiguation methods on real and synthetically generated collections of citation records. We also demonstrate its applicability by evaluating those methods on a time-evolving digital library collection generated with the tool, considering several dynamic and realistic scenarios.
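
To make the idea of profile-based generation concrete, the following is a toy sketch of how successive loads of labeled citation records might be drawn from author profiles, with new authors joining and interests drifting between loads. It is not SyGAR's actual algorithm: the `AuthorProfile` and `CitationRecord` types, the `VOCAB` table, the function names, and the drift rule are all hypothetical and chosen only for illustration.

```python
import random
from dataclasses import dataclass

# Hypothetical topic vocabulary (an assumption, not taken from SyGAR).
VOCAB = {
    "databases": ["query", "index", "transaction", "schema", "join"],
    "retrieval": ["ranking", "relevance", "corpus", "search", "feedback"],
}

@dataclass
class AuthorProfile:
    author_id: int   # ground-truth identity, used later to score a method
    name: str        # name string as printed; may be shared by several authors
    coauthors: list  # names this author tends to publish with
    topics: dict     # topic -> probability (values sum to 1)

@dataclass
class CitationRecord:
    load: int
    author_id: int
    name: str
    coauthors: list
    title: str

def generate_load(profiles, n_records, load, rng):
    """Draw n_records synthetic citations from the currently active profiles."""
    records = []
    for _ in range(n_records):
        p = rng.choice(profiles)
        topic = rng.choices(list(p.topics), weights=list(p.topics.values()))[0]
        title = " ".join(rng.sample(VOCAB[topic], 3))
        coauthors = rng.sample(p.coauthors, min(2, len(p.coauthors)))
        records.append(CitationRecord(load, p.author_id, p.name, coauthors, title))
    return records

def shift_interest(profile, topic, rate):
    """Move a fraction `rate` of the profile's probability mass onto `topic`."""
    for t in profile.topics:
        profile.topics[t] *= 1.0 - rate
    profile.topics[topic] = profile.topics.get(topic, 0.0) + rate

def _copy(p):
    return AuthorProfile(p.author_id, p.name, list(p.coauthors), dict(p.topics))

def simulate(profiles, n_loads, per_load, newcomers=None, drift=None, seed=0):
    """Generate successive loads; `newcomers` maps a load index to a profile
    that joins at that load, and `drift` is an optional (topic, rate) pair
    applied to every active profile after each load."""
    rng = random.Random(seed)
    active = [_copy(p) for p in profiles]  # leave the caller's profiles intact
    records = []
    for load in range(n_loads):
        if newcomers and load in newcomers:
            active.append(_copy(newcomers[load]))
        records.extend(generate_load(active, per_load, load, rng))
        if drift:
            topic, rate = drift
            for p in active:
                shift_interest(p, topic, rate)
    return records
```

Because every record keeps its ground-truth `author_id`, a disambiguation method can be run on the `name`, `coauthors`, and `title` fields alone and then scored against the known labels, load by load.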