Abbreviation Disambiguation: Experiments with Various Variants of the One Sense per Discourse Hypothesis

Authors:
Yaakov Hacohen-Kerner;Ariel Kass;Ariel Peretz
Affiliations:
Department of Computer Science, Jerusalem College of Technology (Machon Lev), Jerusalem, Israel 91160;Department of Computer Science, Jerusalem College of Technology (Machon Lev), Jerusalem, Israel 91160;Department of Computer Science, Jerusalem College of Technology (Machon Lev), Jerusalem, Israel 91160
Venue:
NLDB '08 Proceedings of the 13th international conference on Natural Language and Information Systems: Applications of Natural Language to Information Systems
Year:
2008

Abstract

Abbreviations are common in both written and spoken language. However, they are not always explicitly defined, and in many cases they are ambiguous. In this research, we present a process that attempts to solve the problem of abbreviation ambiguity. Various features have been explored, including context-related and statistical methods. The application domain is Jewish Law documents written in Hebrew, which are known to be rich in ambiguous abbreviations. Several variants of the one sense per discourse hypothesis have been implemented by varying the scope of the discourse. Several common machine learning methods have been tested to find a successful integration of these variants. The best results were achieved by SVM, with 96.09% accuracy.
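
To illustrate what varying the scope of discourse can mean, here is a minimal sketch, not the authors' implementation. It assumes each abbreviation occurrence is a dictionary with hypothetical fields `abbr`, `sense` (a known label or `None`), and scope identifiers `doc` and `section`; the function name and the sample data are invented for illustration. Each occurrence receives the majority sense observed for the same abbreviation within the chosen scope.

```python
from collections import Counter


def one_sense_per_discourse(occurrences, scope_key):
    """Give every occurrence the majority sense of its abbreviation
    within the discourse unit named by scope_key (hypothetical field)."""
    counts = {}
    # Count the labeled senses per (abbreviation, discourse unit) pair.
    for occ in occurrences:
        if occ["sense"] is not None:
            key = (occ["abbr"], occ[scope_key])
            counts.setdefault(key, Counter())[occ["sense"]] += 1
    predictions = []
    # Propagate the most frequent sense to every occurrence in that unit.
    for occ in occurrences:
        counter = counts.get((occ["abbr"], occ[scope_key]))
        predictions.append(counter.most_common(1)[0][0] if counter else None)
    return predictions


# Invented sample data: one abbreviation, two documents, three sections.
occurrences = [
    {"abbr": "AB", "doc": "d1", "section": "d1.1", "sense": "sense_x"},
    {"abbr": "AB", "doc": "d1", "section": "d1.1", "sense": None},
    {"abbr": "AB", "doc": "d1", "section": "d1.2", "sense": "sense_y"},
    {"abbr": "AB", "doc": "d1", "section": "d1.2", "sense": "sense_y"},
    {"abbr": "AB", "doc": "d1", "section": "d1.2", "sense": None},
    {"abbr": "AB", "doc": "d2", "section": "d2.1", "sense": None},
]

by_document = one_sense_per_discourse(occurrences, "doc")
by_section = one_sense_per_discourse(occurrences, "section")
print(by_document)
print(by_section)
```

With the document as the scope, every occurrence in `d1` is assigned `sense_y`; with the section as the scope, the two occurrences in `d1.1` are assigned `sense_x` instead. One plausible way to integrate such variants, again as an assumption, is to pass the prediction of each scope as a separate feature to a classifier such as an SVM.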