On the Use of Density Distribution of Keywords for Automated Generation of Hypertext Links from Arbitrary Parts of Documents

  • Authors:
  • Koichi Kise; Hiroyuki Mizuno; Masashi Yamaguchi; Keinosuke Matsumoto

  • Venue:
  • ICDAR '99 Proceedings of the Fifth International Conference on Document Analysis and Recognition
  • Year:
  • 1999

Abstract

This paper presents a method for the automated generation of hypertext links for electronic documents. The goal is to generate links from an arbitrary part of a document (the source of a link) to the relevant parts of target documents (the destinations). To achieve this goal, we assume that parts of documents that are relevant to each other often share words. To extract parts that densely contain the words of a source (keywords), we employ density distributions of keywords. This enables us to determine destinations simply by extracting the parts whose density exceeds a threshold. Experiments on generating links from figures/tables to parts of documents, as well as from texts to parts of different documents, show that our method with the optimal parameters yields a recall of 60% and a precision of 50%.
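
The idea of thresholding a keyword density distribution can be illustrated with a minimal sketch. The abstract does not give the window function, the tokenization, or the parameter values, so everything below is an assumption made for illustration: the Hann-shaped smoothing window, the regex tokenizer, and the hypothetical names `tokenize`, `hann_window`, `keyword_density`, and `extract_destinations`. It is not the authors' implementation.

```python
import math
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens (assumed tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def hann_window(width):
    """Return Hann-shaped weights of the given width, normalized to sum to 1.

    The window shape is an assumption; the abstract does not specify one.
    Using width + 1 in the denominator keeps every weight strictly positive.
    """
    raw = [0.5 - 0.5 * math.cos(2 * math.pi * (i + 1) / (width + 1))
           for i in range(width)]
    total = sum(raw)
    return [w / total for w in raw]


def keyword_density(tokens, keywords, width=5):
    """Smooth the 0/1 keyword-occurrence sequence with the window.

    Returns one density value per token position of the target document.
    Positions outside the document contribute nothing.
    """
    hits = [1.0 if t in keywords else 0.0 for t in tokens]
    weights = hann_window(width)
    half = width // 2
    density = []
    for pos in range(len(tokens)):
        total = 0.0
        for offset, w in enumerate(weights):
            j = pos + offset - half
            if 0 <= j < len(hits):
                total += w * hits[j]
        density.append(total)
    return density


def extract_destinations(density, threshold):
    """Return (start, end) token spans, end exclusive, whose density exceeds the threshold."""
    spans = []
    start = None
    for pos, value in enumerate(density):
        if value > threshold and start is None:
            start = pos
        elif value <= threshold and start is not None:
            spans.append((start, pos))
            start = None
    if start is not None:
        spans.append((start, len(density)))
    return spans


if __name__ == "__main__":
    # Hypothetical source part and target document, for illustration only.
    source = "keyword density distribution"
    target = ("This section is unrelated. The density distribution of each "
              "keyword is smoothed over the text. Closing remarks follow here.")
    keywords = set(tokenize(source))
    tokens = tokenize(target)
    density = keyword_density(tokens, keywords, width=5)
    for start, end in extract_destinations(density, threshold=0.3):
        print(start, end, " ".join(tokens[start:end]))
```

In this sketch, a wider window merges nearby keyword occurrences into a single destination, while a higher threshold keeps only the most keyword-dense spans; these two values stand in for the parameters the abstract says were tuned.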