Informative benchmarks are crucial for optimizing the page segmentation step of an OCR system, which is frequently the performance-limiting step for the system as a whole. We show that current evaluation scores are insufficient for diagnosing specific errors in page segmentation and fail to identify some classes of serious segmentation errors altogether. This paper introduces a vectorial score that is sensitive to, and identifies, the most important classes of segmentation errors (over-, under-, and mis-segmentation) and which page components (lines, blocks, etc.) are affected. Unlike previous schemes, our evaluation method has a canonical representation of ground-truth data and guarantees pixel-accurate evaluation results for arbitrary region shapes. We present the results of evaluating widely used segmentation algorithms (x-y cut, smearing, whitespace analysis, constrained text-line finding, docstrum, and Voronoi) on the UW-III database and demonstrate that the new evaluation scheme permits the identification of several specific flaws in individual segmentation methods.
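To make the idea of a vector-valued, pixel-level score concrete, the following is a minimal sketch and not the scoring method defined in the paper. It assumes that both the ground truth and the segmentation are given as label images (nested lists of integers, with 0 as background), and it counts a ground-truth region as over-segmented when it shares pixels with more than one segment, and a segment as under-segmented when it shares pixels with more than one ground-truth region. The names `overlap_counts`, `error_vector`, and the `min_pixels` threshold are hypothetical and chosen only for illustration.

```python
from collections import Counter


def overlap_counts(truth, seg):
    """Count pixels shared by each (truth label, segment label) pair.

    Both inputs are equally sized 2D lists of integer labels; 0 is background.
    """
    counts = Counter()
    for truth_row, seg_row in zip(truth, seg):
        for t, s in zip(truth_row, seg_row):
            counts[(t, s)] += 1
    return counts


def error_vector(truth, seg, min_pixels=1):
    """Return per-class error counts instead of a single scalar score.

    min_pixels is an assumed significance threshold: smaller overlaps
    between two foreground regions are ignored.
    """
    counts = overlap_counts(truth, seg)
    # Significant overlaps between a foreground truth region and a foreground segment.
    links = [
        (t, s)
        for (t, s), n in counts.items()
        if t != 0 and s != 0 and n >= min_pixels
    ]
    per_truth = Counter(t for t, _ in links)
    per_seg = Counter(s for _, s in links)
    truth_labels = {t for (t, _) in counts if t != 0}
    seg_labels = {s for (_, s) in counts if s != 0}
    return {
        "over_segmented": sum(1 for t in truth_labels if per_truth[t] > 1),
        "under_segmented": sum(1 for s in seg_labels if per_seg[s] > 1),
        "missed": sum(1 for t in truth_labels if per_truth[t] == 0),
        "false_alarms": sum(1 for s in seg_labels if per_seg[s] == 0),
    }


# Toy page: three ground-truth text lines (labels 1, 2, 3).
truth = [
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [2, 2, 2, 2],
    [3, 3, 3, 3],
]
# The segmenter splits line 1 in two and merges lines 2 and 3.
seg = [
    [1, 1, 2, 2],
    [0, 0, 0, 0],
    [3, 3, 3, 3],
    [3, 3, 3, 3],
]

print(error_vector(truth, seg))
```

Because the comparison is done pixel by pixel on label images, the sketch places no restriction on region shape; a single scalar such as overall pixel accuracy would hide the fact that the toy page contains one split line and one merged pair of lines.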