Disambiguation of Medline abstracts using topic models

Authors:
Mark Stevenson
Affiliations:
Sheffield University, Sheffield, United Kingdom
Venue:
Proceedings of the ACM fifth international workshop on Data and text mining in biomedical informatics
Year:
2011

Abstract

Topic models are an established technique for generating information about the subjects discussed in collections of documents. Latent Dirichlet Allocation (LDA) is a widely applied topic model. The topic models generated by LDA consist of sets of terms associated with each topic, and these term sets are used to provide additional context for a Word Sense Disambiguation (WSD) system. Using this context leads to a statistically significant improvement in the performance of a graph-based WSD system when applied to a standard evaluation resource.
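To make the idea concrete, the following is a minimal sketch of how topic terms might be added to the context of a graph-based disambiguator. It is not the paper's implementation. The abstract does not say which graph algorithm is used, so personalized PageRank is assumed here. The toy graph, the sense labels, the function names, and the topic terms are all hypothetical, and the topic terms are supplied by hand rather than produced by an actual LDA model.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return dict(graph)


def personalized_pagerank(graph, seeds, damping=0.85, iterations=50):
    """Power iteration for PageRank whose restart mass sits on the seed nodes."""
    seeds = [s for s in set(seeds) if s in graph]
    if not seeds:
        seeds = list(graph)  # no usable seed: fall back to a uniform restart
    restart = {node: 0.0 for node in graph}
    for seed in seeds:
        restart[seed] = 1.0 / len(seeds)
    rank = dict(restart)
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) * restart[node] for node in graph}
        for node, neighbours in graph.items():
            # Every node built by build_graph has at least one neighbour.
            share = damping * rank[node] / len(neighbours)
            for neighbour in neighbours:
                new_rank[neighbour] += share
        rank = new_rank
    return rank


def disambiguate(graph, senses, context_words, topic_terms=()):
    """Pick the candidate sense with the highest rank.

    Assumption: topic terms are simply appended to the local context and
    weighted the same as context words.
    """
    rank = personalized_pagerank(graph, list(context_words) + list(topic_terms))
    return max(senses, key=lambda sense: rank.get(sense, 0.0))


# Hypothetical toy knowledge graph for the ambiguous term "cold".
TOY_GRAPH = build_graph([
    ("common_cold", "virus"),
    ("common_cold", "infection"),
    ("common_cold", "cough"),
    ("cold_temperature", "temperature"),
    ("cold_temperature", "hypothermia"),
    ("cold_temperature", "exposure"),
])
SENSES = ["common_cold", "cold_temperature"]

if __name__ == "__main__":
    local_context = ["exposure"]            # words near the ambiguous term
    topic_terms = ["virus", "infection"]    # hand-written stand-in for LDA topic terms
    print(disambiguate(TOY_GRAPH, SENSES, local_context))
    print(disambiguate(TOY_GRAPH, SENSES, local_context, topic_terms))
```

In this toy setting the local context alone favours the temperature sense, while adding the topic terms shifts the ranking towards the illness sense. In practice the topic terms would come from a trained LDA model and the graph from a resource such as a biomedical thesaurus, neither of which is reproduced here.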