International Journal of Knowledge-based and Intelligent Engineering Systems
This paper explores the use of a character-segment-based correction model, language modeling, and shallow morphology for Arabic OCR error correction. Experiments show that character-segment-based correction is superior to single-character correction and that language modeling boosts correction by improving the ranking of candidate corrections, whereas shallow morphology has a small adverse effect. Further, given a sufficiently large corpus from which to extract a dictionary and train a language model, word-based correction works well for a morphologically rich language such as Arabic.
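To make the interplay between segment-level correction and language-model ranking concrete, here is a minimal sketch of a noisy-channel corrector. It is an illustration under stated assumptions, not the paper's implementation: the confusion table `SEGMENT_MODEL`, the constant `P_UNCHANGED`, the toy Latin-script corpus, the unigram language model, and the limit of one segment edit per word are all hypothetical choices made for readability.

```python
import math
from collections import Counter

# Hypothetical confusion table (not from the paper):
# (observed_segment, intended_segment) -> assumed P(observed | intended).
# Segments may span more than one character, e.g. "m" misread as "rn".
SEGMENT_MODEL = {
    ("rn", "m"): 0.30,
    ("m", "rn"): 0.05,
    ("l", "i"): 0.20,
    ("c", "e"): 0.10,
}
P_UNCHANGED = 0.60  # assumed probability that a word was recognized correctly


def candidates(observed):
    """Yield (candidate, channel_prob) pairs using at most one segment edit."""
    yield observed, P_UNCHANGED
    for (seen, intended), prob in SEGMENT_MODEL.items():
        start = observed.find(seen)
        while start != -1:
            yield observed[:start] + intended + observed[start + len(seen):], prob
            start = observed.find(seen, start + 1)


def correct(observed, word_counts):
    """Rank in-dictionary candidates by log P(observed | cand) + log P(cand)."""
    total = sum(word_counts.values())
    scored = []
    for cand, channel in candidates(observed):
        if cand in word_counts:  # the dictionary is extracted from the corpus
            lm = word_counts[cand] / total  # unigram language model
            scored.append((math.log(channel) + math.log(lm), cand))
    scored.sort(reverse=True)
    # Fall back to the observed token when no candidate is in the dictionary.
    return [cand for _, cand in scored] or [observed]


corpus = "the modern corn farm is modern and the corn is modem free".split()
counts = Counter(corpus)

print(correct("rnodern", counts))
print(correct("modem", counts))
```

In this sketch the multi-character entry `("rn", "m")` is what lets `rnodern` reach a dictionary word in a single edit, which a strictly one-character-to-one-character substitution table could not do. The language-model term only reorders candidates that the channel model and the dictionary already admit, which mirrors the ranking role the abstract attributes to language modeling.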