Document page retrieval based on geometric layout features

  • Authors: Hong Yan; Toyohide Watanabe

  • Affiliations: Nagoya University, Furo-cho, Chikusa-ku, Japan (both authors)

  • Venue: Proceedings of the 7th International Conference on Ubiquitous Information Management and Communication

  • Year: 2013

Abstract

Today, keyword retrieval is the most standard and popular retrieval method and is widely used in many applications. However, keyword retrieval cannot always satisfy every type of information search, because a multimedia society must manage many kinds of information resources, such as image data and graphics data, in addition to word-dependent information. Retrieval methods suited to the characteristics of the data resources, such as structure, design, application, usage, and volume, are therefore necessary for users' information-based activities to succeed. In this paper, we address a document page retrieval method based on the characteristics of page layout structure. Although keyword retrieval is an excellent means of document page retrieval, we must pay attention to cases in which keywords are not necessarily effective: it is not easy for foreigners to use keywords in a different language, and it is difficult for children to remember unfamiliar words. The main idea of our method is to focus on the geometric/positional relationships between characteristic components in identifying document pages. Moreover, our original viewpoint is to introduce the inverted index, commonly used in conventional information retrieval systems, rather than to make use of the structural/spatial relationships between characteristic components that are standard in traditional map retrieval systems.
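
The abstract does not specify which layout features are indexed or how pages are ranked, so the following is only a minimal sketch of the general idea of an inverted index over layout features, not the authors' method. It assumes that each page component is a labelled bounding box in normalized page coordinates, that a component's position is quantized to a cell of a coarse grid, and that pages are ranked by the number of shared index terms. The names `layout_terms`, `build_index`, `search`, the `GRID` size, and the component labels are all hypothetical.

```python
from collections import defaultdict

GRID = 4  # assumption: each page is divided into a 4x4 grid of cells


def layout_terms(components, grid=GRID):
    """Map (kind, x, y, w, h) boxes in normalized [0, 1] coordinates
    to index terms of the form (kind, column, row)."""
    terms = set()
    for kind, x, y, w, h in components:
        cx, cy = x + w / 2, y + h / 2        # center of the component
        col = min(int(cx * grid), grid - 1)  # clamp so cx == 1.0 stays in range
        row = min(int(cy * grid), grid - 1)
        terms.add((kind, col, row))
    return terms


def build_index(pages):
    """Inverted index: layout term -> set of page ids containing it."""
    index = defaultdict(set)
    for page_id, components in pages.items():
        for term in layout_terms(components):
            index[term].add(page_id)
    return index


def search(index, query_components):
    """Rank pages by how many layout terms they share with the query
    (an assumed scoring rule; ties are broken by page id)."""
    scores = defaultdict(int)
    for term in layout_terms(query_components):
        for page_id in index.get(term, ()):
            scores[page_id] += 1
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical pages described only by their layout components.
pages = {
    "p1": [("title", 0.1, 0.05, 0.8, 0.1),
           ("figure", 0.1, 0.2, 0.35, 0.3),
           ("text", 0.55, 0.2, 0.35, 0.7)],
    "p2": [("title", 0.1, 0.05, 0.8, 0.1),
           ("text", 0.1, 0.2, 0.8, 0.7)],
    "p3": [("figure", 0.55, 0.55, 0.35, 0.3),
           ("text", 0.1, 0.15, 0.35, 0.8)],
}
index = build_index(pages)

# Query: "a title across the top and a figure in the upper left".
results = search(index, [("title", 0.1, 0.05, 0.8, 0.1),
                         ("figure", 0.08, 0.25, 0.4, 0.3)])
print(results)
```

In this sketch a user can describe a remembered page by rough component positions instead of keywords, and lookup cost depends on the number of query terms rather than the number of stored pages, which is the usual motivation for an inverted index.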