Script-agnostic reflow of text in document images

Authors:
Saurabh Panjwani;Abhinav Uppal;Edward Cutrell
Affiliations:
Bell Labs India;IIT Delhi;Microsoft Research India
Venue:
Proceedings of the 13th International Conference on Human Computer Interaction with Mobile Devices and Services
Year:
2011

Abstract

Reading text in document images can be difficult on mobile devices because of their limited screen width. Solutions exist for reflowing Latin-script text on such devices, but they do not work well for images of other scripts or combinations of scripts, since they rely on script-specific characteristics or OCR. We present a technique that reflows text in document images in a manner that is agnostic to the script used to compose them. Our technique achieved over 95% segmentation accuracy on a corpus of 139 images containing text in 4 genetically distant languages: English, Hindi, Kannada and Arabic. A preliminary user study with a prototype implementation of the technique provided evidence of some of its usability benefits.