This paper describes a workflow designed to populate a digital library of ancient Greek critical editions with highly accurate OCR text. Although the most recent OCR engines can, after suitable training, handle the polytonic Greek fonts used in 19th- and 20th-century editions, post-processing can yield further improvements. In particular, the paper discusses a progressive multiple alignment method applied to different OCR outputs produced from the same page images.
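The general idea of progressive multiple alignment over OCR outputs can be illustrated with a minimal sketch. This is not the paper's implementation: the function names (`align_to_profile`, `consensus`), the scoring values, the gap symbol, and the majority-vote step are all assumptions chosen for illustration. Each OCR output is aligned in turn against the growing set of already aligned outputs (the "profile"), and a consensus character is then chosen per column.

```python
from collections import Counter

GAP = "-"  # assumed gap symbol; must not occur in the OCR text itself
GAP_SCORE = -1  # assumed penalty for inserting a gap


def column_score(column, ch):
    """Average agreement between one profile column and a character."""
    return sum(1 if c == ch else -1 for c in column) / len(column)


def align_to_profile(profile, text):
    """Globally align `text` to `profile` (a list of columns of characters)."""
    n, m = len(profile), len(text)
    depth = len(profile[0])
    # Needleman-Wunsch style dynamic programming table.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * GAP_SCORE
    for j in range(1, m + 1):
        score[0][j] = j * GAP_SCORE
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + column_score(profile[i - 1], text[j - 1]),
                score[i - 1][j] + GAP_SCORE,
                score[i][j - 1] + GAP_SCORE,
            )
    # Trace back from the bottom-right corner, extending every column by one row.
    columns = []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and score[i][j]
                == score[i - 1][j - 1] + column_score(profile[i - 1], text[j - 1])):
            columns.append(profile[i - 1] + [text[j - 1]])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + GAP_SCORE:
            columns.append(profile[i - 1] + [GAP])  # text lacks this character
            i -= 1
        else:
            columns.append([GAP] * depth + [text[j - 1]])  # text has an extra one
            j -= 1
    columns.reverse()
    return columns


def consensus(outputs):
    """Progressively align all outputs, then take a majority vote per column."""
    profile = [[ch] for ch in outputs[0]]
    for text in outputs[1:]:
        profile = align_to_profile(profile, text)
    chars = []
    for column in profile:
        best, _ = Counter(column).most_common(1)[0]
        if best != GAP:
            chars.append(best)
    return "".join(chars)


if __name__ == "__main__":
    # Three hypothetical engine outputs for the same line, each with one error.
    print(consensus(["menin aeide thea", "menln aeide thea", "menin aeide thca"]))
```

In this sketch, ties in the vote fall to the output that was aligned earliest, so the order of `outputs` matters. A real workflow would likely weight engines by reliability and normalize Unicode in polytonic text before alignment.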