High-accuracy text search of hardcopy logs

  • Authors:
  • Yasuhiro Fujii; Ryu Ebisawa; Satoshi Kai; Takaaki Yamada; Yoshinori Honda

  • Affiliations:
  • HITACHI, Ltd., Yoshida-cho, Totsuka-ku, Yokohama-shi, Kanagawa-ken, Japan (all authors)

  • Venue:
  • Proceedings of the 11th International Conference on Information Integration and Web-based Applications & Services
  • Year:
  • 2009

Abstract

Information leakage through paper documents has become serious enough that manufacturers of multifunction printers now provide log capture devices that record a document image whenever the printer is used. They also offer text search engines based on OCR text extraction from the captured images. Because OCR accuracy is limited, this paper proposes a system that increases the accuracy of text searches over logs acquired by multifunction printers: each page of a paper document is numbered, and those ID numbers are linked to the text data extracted from the original digital files. Experimental results show that this increases the accuracy rate of text searches from 52.4% to 98.0%.
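
The linking idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `PageLogIndex`, its methods, and the in-memory dictionary are all hypothetical, and the sketch assumes that a page ID is assigned at print time and that the text of the original digital file is available for each page. Searches then run over that exact text rather than over OCR output, and the returned IDs would identify the matching captured page images.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class PageLogIndex:
    """Hypothetical index linking printed page IDs to source text."""

    next_id: int = 1
    text_by_page: dict[int, str] = field(default_factory=dict)

    def register_page(self, source_text: str) -> int:
        """Assign the next page ID and store the text from the digital file."""
        page_id = self.next_id
        self.next_id += 1
        self.text_by_page[page_id] = source_text
        return page_id

    def search(self, query: str) -> list[int]:
        """Return IDs of pages whose source text contains the query."""
        needle = query.casefold()
        return sorted(
            page_id
            for page_id, text in self.text_by_page.items()
            if needle in text.casefold()
        )


# Usage: two printed pages are registered, then searched by keyword.
index = PageLogIndex()
first = index.register_page("Quarterly sales report: confidential")
second = index.register_page("Cafeteria menu for next week")
hits = index.search("confidential")  # hits == [first]
```

In a real deployment the ID would also have to be readable from the captured page image so that a hit can be matched to its log entry; how the paper encodes the ID on the page is not stated in the abstract.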