Document image similarity and equivalence detection
ICDAR '97 Proceedings of the 4th International Conference on Document Analysis and Recognition
Exploiting the structural characteristics of a compressed document can speed up certain image processing operations. Alternatively, one can design a compression scheme that lends itself to efficient manipulation of documents without compromising the compression factor. Here, a run-based compression technique for binary documents is discussed. In addition to achieving bit rates comparable to those of other compression schemes, the technique preserves document features that are useful for analysis and manipulation of the data. Algorithms are proposed to perform vertical run extraction and similar operations in the compressed domain. These algorithms are implemented in software. Experimental results indicate that fast analysis of electronic data is possible if the data is coded according to the proposed scheme.
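
To illustrate what vertical run extraction from run-coded data can look like, here is a minimal sketch in Python. It is not the scheme proposed in the paper, whose coding details are not given in this abstract. It assumes, purely for illustration, that each row of a binary image is stored as a list of (start, length) pairs of black runs, and the function names encode_rows and vertical_runs are hypothetical. The sketch derives vertical black runs directly from the horizontal runs without rebuilding the bitmap.

```python
def encode_rows(image):
    """Encode each row of a binary image as a list of (start, length) black runs.

    Assumed representation for illustration only, not the paper's format.
    """
    encoded = []
    for row in image:
        row_runs = []
        start = None  # column where the current black run began, if any
        for x, pixel in enumerate(row):
            if pixel and start is None:
                start = x
            elif not pixel and start is not None:
                row_runs.append((start, x - start))
                start = None
        if start is not None:  # run reaches the right edge
            row_runs.append((start, len(row) - start))
        encoded.append(row_runs)
    return encoded


def vertical_runs(row_runs, width):
    """Return sorted (column, top_row, length) vertical black runs.

    Works on the horizontal runs only; the pixel array is never rebuilt.
    This is a naive sweep, not an optimized algorithm.
    """
    open_top = [None] * width  # top row of the vertical run open in each column
    result = []
    for y, runs in enumerate(row_runs):
        covered = [False] * width  # columns that are black in this row
        for start, length in runs:
            for x in range(start, start + length):
                covered[x] = True
                if open_top[x] is None:
                    open_top[x] = y  # a new vertical run starts here
        for x in range(width):
            if not covered[x] and open_top[x] is not None:
                # the column turned white: close its vertical run
                result.append((x, open_top[x], y - open_top[x]))
                open_top[x] = None
    height = len(row_runs)
    for x in range(width):
        if open_top[x] is not None:  # run reaches the bottom edge
            result.append((x, open_top[x], height - open_top[x]))
    return sorted(result)


image = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [1, 1, 0, 1],
]
coded = encode_rows(image)
columns = vertical_runs(coded, width=4)
```

For this small image, coded holds one list of horizontal runs per row, and columns holds one tuple per vertical run, for example (1, 0, 3) for the three-pixel run in column 1. A practical implementation would avoid the per-row sweep over all columns, which this sketch keeps for clarity.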