A platform for storing, visualizing, and interpreting collections of noisy documents
AND '10 Proceedings of the fourth workshop on Analytics for noisy unstructured text data
Developing better systems for document image analysis requires understanding errors, their sources, and their effects. The interactions between the various processing steps are complex, and their details can be obscured by the statistical methods commonly employed. In this paper, we describe tools we are building to help the user view and understand the results of common document analysis procedures. Unlike existing platforms for ground-truthing page images, our system also allows users to visualize the results of automated error analyses. Recognition errors can be corrected interactively, with the effort required to do so recorded as a measure useful in performance evaluation. Beyond this functionality for exploring error behavior, we consider how such tools could be designed to incrementally improve the quality of collections of badly recognized documents as users interact with them on a regular basis. We conclude by discussing topics for future research.
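The abstract mentions recording the effort of interactive error correction as a performance measure. One plausible proxy for such a measure (an assumption for illustration; the paper does not specify the authors' actual metric) is the character-level edit distance between the raw OCR output and the user-corrected text:

```python
# Hypothetical sketch: approximate correction effort as the Levenshtein
# edit distance between noisy OCR output and its corrected form.
# This is an illustrative proxy, not the measure defined in the paper.

def edit_distance(a: str, b: str) -> int:
    """Minimum number of insertions, deletions, and substitutions
    needed to transform string a into string b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete from a
                           cur[j - 1] + 1,              # insert into a
                           prev[j - 1] + (ca != cb)))   # substitute
        prev = cur
    return prev[-1]

# Example: effort to repair a noisy OCR line against its corrected form.
ocr_line = "Tne qu1ck brown f0x"
corrected = "The quick brown fox"
print(edit_distance(ocr_line, corrected))  # → 3
```

A real effort measure would more likely count user actions (keystrokes, mouse selections, time spent) rather than string operations, but edit distance gives a lower bound on the textual changes the user must make.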