A probabilistic similarity metric for Medline records: A model for author name disambiguation

Authors:
Vetle I. Torvik;Marc Weeber;Don R. Swanson;Neil R. Smalheiser
Affiliations:
Department of Psychiatry (MC912), University of Illinois, Chicago, 1601 W. Taylor Street, Chicago, IL 60612;Department of Psychiatry (MC912), University of Illinois, Chicago, 1601 W. Taylor Street, Chicago, IL 60612;Division of the Humanities, University of Chicago, Chicago, IL 60637;Department of Psychiatry (MC912), University of Illinois, Chicago, 1601 W. Taylor Street, Chicago, IL 60612
Venue:
Journal of the American Society for Information Science and Technology
Year:
2005

Abstract

We present a model for estimating the probability that a pair of author names (sharing last name and first initial), appearing on two different Medline articles, refer to the same individual. The model uses a simple yet powerful similarity profile between a pair of articles, based on title, journal name, coauthor names, medical subject headings (MeSH), language, affiliation, and name attributes (prevalence in the literature, middle initial, and suffix). The similarity profile distribution is computed from reference sets consisting of pairs of articles containing almost exclusively author matches versus nonmatches, generated in an unbiased manner. Although the match set is generated automatically and might contain a small proportion of nonmatches, the model is quite robust against contamination with nonmatches. We have created a free, public service (“Author-ity”: ) that takes as input an author's name given on a specific article, and gives as output a list of all articles with that (last name, first initial) ranked by decreasing similarity, with match probability indicated. © 2005 Wiley Periodicals, Inc.
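The idea of turning a pairwise similarity profile into a match probability can be illustrated with a minimal sketch. It is not the authors' model: the three features, the capping of counts, the add-one smoothing, the `prior` parameter, and all function and field names below are assumptions chosen for illustration. The sketch counts how often each profile occurs in a reference set of matches and in a reference set of nonmatches, then applies Bayes' rule.

```python
from collections import Counter

def similarity_profile(a, b):
    """Compare two article records (dicts) and return a small integer tuple.

    Hypothetical features: shared title words, shared coauthors, same journal.
    """
    title_a = set(a["title"].lower().split())
    title_b = set(b["title"].lower().split())
    shared_title_words = len(title_a & title_b)
    shared_coauthors = len(set(a["coauthors"]) & set(b["coauthors"]))
    same_journal = int(a["journal"] == b["journal"])
    # Cap the counts at 2 so the number of distinct profiles stays small.
    return (min(shared_title_words, 2), min(shared_coauthors, 2), same_journal)

def profile_frequencies(pairs):
    """Count how often each profile occurs in a reference set of article pairs."""
    return Counter(similarity_profile(a, b) for a, b in pairs)

def match_probability(profile, match_freq, nonmatch_freq, prior=0.5, n_profiles=18):
    """Estimate P(match | profile) with Bayes' rule.

    `prior` is an assumed prior probability of a match; `n_profiles` is the
    number of possible profiles (3 * 3 * 2 here), used for add-one smoothing
    so that unseen profiles do not produce a zero likelihood.
    """
    p_given_match = (match_freq[profile] + 1) / (sum(match_freq.values()) + n_profiles)
    p_given_nonmatch = (nonmatch_freq[profile] + 1) / (sum(nonmatch_freq.values()) + n_profiles)
    numerator = prior * p_given_match
    return numerator / (numerator + (1 - prior) * p_given_nonmatch)
```

To rank candidate articles for a given author name, one would compute `match_probability(similarity_profile(query, candidate), ...)` for each candidate sharing the last name and first initial, then sort by decreasing probability. In practice the prior would presumably depend on how common the name is, which this sketch leaves as a fixed parameter.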