In this paper, we consider the problem of ambiguous author names in bibliographic citations, and comparatively study alternative approaches to identify and correct such name variants (e.g., "Vannevar Bush" and "V. Bush"). Our study is based on a scalable two-step framework: step 1 substantially reduces the number of candidate name pairs via blocking, and step 2 measures the distance between two names using their coauthor information. Combining four blocking methods and seven distance measures on four data sets, we present extensive experimental results and identify combinations that are both scalable and effective at disambiguating author names in citations.
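The two-step framework can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not name the blocking methods or distance measures, so the blocking key (last name plus first initial), the Jaccard distance over coauthor sets, the 0.5 threshold, and the toy records are all assumptions chosen for illustration.

```python
from collections import defaultdict
from itertools import combinations


def block_key(name):
    """Hypothetical blocking key: (last token, first initial), lowercased."""
    tokens = name.replace(".", " ").split()
    return (tokens[-1].lower(), tokens[0][0].lower())


def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b|; two empty sets count as maximally distant."""
    union = a | b
    if not union:
        return 1.0
    return 1.0 - len(a & b) / len(union)


def candidate_pairs(coauthors, threshold=0.5):
    """Step 1: group names into blocks. Step 2: compare names only within a block.

    `coauthors` maps an author name string to the set of its coauthor names.
    The threshold is an assumed cutoff, not a value from the paper.
    """
    blocks = defaultdict(list)
    for name in coauthors:
        blocks[block_key(name)].append(name)

    pairs = []
    for names in blocks.values():
        for x, y in combinations(sorted(names), 2):
            d = jaccard_distance(coauthors[x], coauthors[y])
            if d <= threshold:
                pairs.append((x, y, d))
    return pairs


# Toy data, invented for this example only.
records = {
    "Vannevar Bush": {"Coauthor A", "Coauthor B"},
    "V. Bush": {"Coauthor A", "Coauthor B", "Coauthor C"},
    "Victor Bush": {"Coauthor D"},
    "John Smith": {"Coauthor A"},
}

pairs = candidate_pairs(records)
for x, y, d in pairs:
    print(f"{x!r} ~ {y!r} (distance {d:.2f})")
```

Blocking keeps "John Smith" from ever being compared with the three "Bush" names, which is where the scalability comes from. Within the block, only the pair with heavily overlapping coauthors falls under the threshold.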