CICLing'06 Proceedings of the 7th international conference on Computational Linguistics and Intelligent Text Processing
We describe a language-independent, flexible, and accurate method for detecting abbreviations in text corpora. It is based on the idea that an abbreviation can be viewed as a collocation and can therefore be identified with collocation-detection methods such as the log-likelihood ratio. Although the log-likelihood ratio is known to have good recall, its precision is poor. We employ scaling factors that substantially improve precision. Experiments with English and German corpora show that abbreviations can be detected with high accuracy.
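As a rough illustration of the collocation view, the sketch below computes the standard log-likelihood ratio (G²) for a 2x2 contingency table. It assumes, for illustration only, that the collocation in question is a word type followed by a period; the function names, the table layout, and the counts are hypothetical. The scaling factors mentioned in the abstract are not specified here, so they are not implemented.

```python
import math


def _term(observed: float, expected: float) -> float:
    """One summand of G^2; a zero observed count contributes nothing."""
    return observed * math.log(observed / expected) if observed > 0 else 0.0


def log_likelihood_ratio(k11: int, k12: int, k21: int, k22: int) -> float:
    """G^2 statistic for a 2x2 contingency table.

    Assumed (hypothetical) layout for abbreviation candidates:
      k11: candidate word followed by a period
      k12: candidate word not followed by a period
      k21: other tokens followed by a period
      k22: other tokens not followed by a period
    """
    n = k11 + k12 + k21 + k22
    if n <= 0:
        raise ValueError("contingency table must contain at least one count")
    cells = ((k11, k12), (k21, k22))
    row_sums = (k11 + k12, k21 + k22)
    col_sums = (k11 + k21, k12 + k22)
    g2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count under independence of word and period.
            expected = row_sums[i] * col_sums[j] / n
            g2 += _term(cells[i][j], expected)
    return 2.0 * g2


# Invented counts: "Dr" is always followed by a period,
# "house" only about as often as an average token.
score_dr = log_likelihood_ratio(50, 0, 1000, 20000)
score_house = log_likelihood_ratio(5, 95, 1045, 19905)
print(score_dr > score_house)
```

Note that G² is two-sided: it is also large for a word that is followed by a period far less often than expected, so the raw score alone does not separate abbreviations from other strongly associated or dissociated pairs. This is consistent with the poor precision the abstract reports for the unscaled measure.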