The paper describes a tool for processing historical (Slovene) text. It annotates each word in a TEI-encoded corpus with its modern-day equivalent, morphosyntactic tag, and lemma. Such a tool is useful for developing historical corpora of highly inflecting languages, enabling full-text search in digital libraries of historical texts, modernising such texts for today's readers, and simplifying the correction of OCR transcriptions.
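To make the kind of annotation described above concrete, the following is a minimal sketch, not the tool's actual implementation. It assumes words are encoded as TEI `<w>` elements and that a lookup table from historical forms to modern forms is available. The `annotate` function, the `LEXICON` table, its single toy entry, and the choice of the `norm`, `lemma`, and `msd` attributes are all illustrative assumptions; the paper's own encoding and methods may differ.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical toy lexicon (an assumption for illustration only):
# historical word form -> (modern form, lemma, morphosyntactic tag)
LEXICON = {
    "zhlovek": ("človek", "človek", "Ncmsn"),
}

# Tiny made-up TEI fragment with two word tokens.
SAMPLE = (
    '<p xmlns="http://www.tei-c.org/ns/1.0">'
    "<w>zhlovek</w> <w>je</w>"
    "</p>"
)


def annotate(tei_xml, lexicon):
    """Add norm/lemma/msd attributes to every <w> found in the lexicon."""
    root = ET.fromstring(tei_xml)
    for w in root.iter(f"{{{TEI_NS}}}w"):
        entry = lexicon.get((w.text or "").lower())
        if entry is None:
            continue  # unknown word: leave it unannotated
        modern, lemma, msd = entry
        w.set("norm", modern)   # modern-day equivalent
        w.set("lemma", lemma)   # lemma of the modern form
        w.set("msd", msd)       # morphosyntactic tag
    return root


if __name__ == "__main__":
    annotated = annotate(SAMPLE, LEXICON)
    print(ET.tostring(annotated, encoding="unicode"))
```

A plain dictionary lookup only covers forms that have been seen before. A real system for a highly inflecting language would need some way to handle unseen historical spellings, which this sketch deliberately leaves out.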