Statistical-based abbreviation expansion

Authors:
Jan Zelinka;Jan Romportl;Luděk Müller
Affiliations:
Department of Cybernetics, University of West Bohemia, Plzen, Czech Republic;Department of Cybernetics, University of West Bohemia, Plzen, Czech Republic;Department of Cybernetics, University of West Bohemia, Plzen, Czech Republic
Venue:
TSD'11: Proceedings of the 14th International Conference on Text, Speech and Dialogue
Year:
2011

Abstract

This paper deals with text normalization for highly inflectional languages. It focuses on abbreviation expansion and also on the normalization of numerals. Our text normalization system does not use any explicit parser or part-of-speech tagger, and thus it can be called lightly supervised. The standard rule-based text normalization method is compared with the proposed statistical method on the task of expanding Czech abbreviations.