The availability of open, ground-truthed datasets and clear performance metrics is a crucial factor in the development of an application domain. The domain of colour text image analysis (real scenes, Web and spam images, scanned colour documents) has traditionally suffered from the lack of a comprehensive performance evaluation framework. Such a framework is extremely difficult to specify, and the corresponding pixel-level accurate information is tedious to define. In this paper we discuss the challenges and technical issues associated with developing such a framework. We then describe a complete framework for the evaluation of text extraction methods at multiple levels, provide a detailed ground-truth specification, and present a case study on how this framework can be used in a real-life situation.