Learning Abbreviations from Chinese and English Terms by Modeling Non-Local Information

Authors:
Xu Sun; Naoaki Okazaki; Jun’ichi Tsujii; Houfeng Wang
Affiliations:
Peking University; Tohoku University; Microsoft Research Asia; Peking University
Venue:
ACM Transactions on Asian Language Information Processing (TALIP)
Year:
2013

Abstract

This article describes a robust approach to abbreviating terms. To incorporate non-local information into abbreviation generation, we present an implicit and an explicit solution: a latent variable model and a label encoding with global information, respectively. Although the two approaches compete with one another, we find that they are also highly complementary. We propose a combination of the two and show that it outperforms all existing methods on abbreviation generation datasets. To reduce the computational cost of learning non-local information, we further present an online training method that reaches the objective optimum with accelerated training speed. We ran experiments on a Chinese newswire dataset and an English biomedical dataset. The proposed abbreviation generator with non-local information achieved the best results for both Chinese and English.
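
The abstract does not spell out what a label encoding with global information looks like, so the following is only an illustrative sketch under stated assumptions. It treats abbreviation generation as character-level sequence labeling, where each character of the full term is either produced (P) or skipped (S), and it assumes that the global information attached to each label is the number of characters produced so far. The function names `ps_labels` and `encode_with_count` and the greedy alignment are hypothetical choices made for this example, not the authors' implementation.

```python
def ps_labels(term, abbreviation):
    """Label each character of `term` as 'P' (produced) or 'S' (skipped).

    Assumption: a greedy left-to-right alignment, which requires the
    abbreviation to be a subsequence of the term.
    """
    labels = []
    j = 0  # index of the next abbreviation character to match
    for ch in term:
        if j < len(abbreviation) and ch == abbreviation[j]:
            labels.append("P")
            j += 1
        else:
            labels.append("S")
    if j != len(abbreviation):
        raise ValueError("abbreviation is not a subsequence of term")
    return labels


def encode_with_count(labels):
    """Append to each label the number of characters produced so far.

    Assumption: this running count stands in for the 'global information';
    it lets a first-order sequence model see how long the abbreviation
    already is when it decides on the next character.
    """
    encoded = []
    produced = 0
    for label in labels:
        if label == "P":
            produced += 1
        encoded.append(f"{label}{produced}")
    return encoded


if __name__ == "__main__":
    labels = ps_labels("北京大学", "北大")
    print(labels)                     # plain P/S labels
    print(encode_with_count(labels))  # labels carrying the running count
```

With plain P/S labels, a model that only looks at adjacent labels cannot tell whether it has already produced one character or five. Folding the running count into the label set is one way to expose that information without changing the model structure, at the cost of a larger label space.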