Predicting Chinese abbreviations from definitions: an empirical learning approach using support vector regression

Authors:
Xu Sun;Hou-Feng Wang;Bo Wang
Affiliations:
Institute of Computational Linguistics, School of Electronics Engineering and Computer Science, Peking University, Beijing, China and Graduate School of Information Science and Technology, The Uni ...;Institute of Computational Linguistics, School of Electronics Engineering and Computer Science, Peking University, Beijing, China;Institute of Computational Linguistics, School of Electronics Engineering and Computer Science, Peking University, Beijing, China
Venue:
Journal of Computer Science and Technology
Year:
2008

Abstract

In Chinese, phrases and named entities play a central role in information retrieval. Abbreviations, however, make keyword-based approaches less effective. This paper presents an empirical learning approach to Chinese abbreviation prediction. Each abbreviation is treated as a reduced form of its definition (expanded form), and abbreviation prediction is formalized as a scoring and ranking problem over abbreviation candidates, which are generated automatically from the definition. Support Vector Regression (SVR) is used for scoring, which yields multiple abbreviation candidates together with their SVR values; these values are then used to rank the candidates. Experimental results show that the SVR method performs better than the popular heuristic rule for abbreviation prediction. It also outperforms the hidden Markov model (HMM) on the same task.
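
The generate, score, and rank pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The feature set, the regression target (Jaccard overlap with the reference abbreviation), the toy training pairs, and all function names are assumptions made for this example. The regressor is a plain linear SVR trained by subgradient descent on the epsilon-insensitive loss, which may differ from the solver and kernel used in the paper.

```python
from itertools import combinations

def generate_candidates(words):
    """Enumerate order-preserving character subsequences of the definition.

    `words` is the segmented definition, e.g. ["北京", "大学"]. Each candidate
    is (string, kept_positions), with length from 2 to len(definition) - 1.
    """
    chars = "".join(words)
    n = len(chars)
    return [("".join(chars[i] for i in idx), idx)
            for k in range(2, n)
            for idx in combinations(range(n), k)]

def features(words, idx):
    """Hypothetical features; not the feature set used in the paper."""
    starts, pos = set(), 0
    for w in words:
        starts.add(pos)          # position of each word's first character
        pos += len(w)            # after the loop, pos == definition length
    kept = set(idx)
    initial_kept = len(kept & starts)
    return [
        1.0,                         # bias term
        initial_kept / len(starts),  # share of words whose first character is kept
        initial_kept / len(kept),    # share of kept characters that are word-initial
        len(kept) / pos,             # compression ratio
    ]

def target(idx, gold_idx):
    """Assumed regression target: Jaccard overlap with the reference abbreviation."""
    a, b = set(idx), set(gold_idx)
    return len(a & b) / len(a | b)

def train_linear_svr(X, y, eps=0.05, lam=1e-3, lr=0.02, epochs=300):
    """Linear SVR in the primal: subgradient descent on the epsilon-insensitive loss."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        for x, t in zip(X, y):
            err = sum(wi * xi for wi, xi in zip(w, x)) - t
            # No loss gradient inside the epsilon tube; otherwise the sign of the error.
            g = 0.0 if abs(err) <= eps else (1.0 if err > 0 else -1.0)
            w = [wi - lr * (lam * wi + g * xi) for wi, xi in zip(w, x)]
    return w

def rank(words, w):
    """Score every candidate with the learned linear function and sort best-first."""
    scored = [(sum(wi * xi for wi, xi in zip(w, features(words, idx))), cand)
              for cand, idx in generate_candidates(words)]
    return sorted(scored, key=lambda sc: -sc[0])

# Toy training data: (segmented definition, kept positions of the abbreviation).
TRAIN = [
    (["北京", "大学"], (0, 2)),           # 北京大学 -> 北大
    (["中国", "科学", "院"], (0, 2, 4)),   # 中国科学院 -> 中科院
    (["人民", "代表", "大会"], (0, 4)),    # 人民代表大会 -> 人大
]

X, y = [], []
for words, gold in TRAIN:
    for _, idx in generate_candidates(words):
        X.append(features(words, idx))
        y.append(target(idx, gold))

weights = train_linear_svr(X, y)
ranking = rank(["师范", "大学"], weights)     # unseen definition
top3 = [cand for _, cand in ranking[:3]]     # three highest-scoring candidates
```

Because every candidate receives a real-valued score, the output is a full ranked list (`ranking`) and not a single prediction, which matches the abstract's point about obtaining multiple candidates with their SVR values. Exhaustive enumeration grows exponentially with definition length, so a practical system would likely need to prune the candidate set.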