On the Use of Density Distribution of Keywords for Automated Generation of Hypertext Links from Arbitrary Parts of Documents

Authors:
Koichi Kise;Hiroyuki Mizuno;Masashi Yamaguchi;Keinosuke Matsumoto
Venue:
ICDAR '99 Proceedings of the Fifth International Conference on Document Analysis and Recognition
Year:
1999

Cited by (4):

Spotting Where to Read on Pages - Retrieval of Relevant Parts from Page Images
DAS '02 Proceedings of the 5th International Workshop on Document Analysis Systems V

Passage-Based Document Retrieval as a Tool for Text Mining with User's Information Needs
DS '01 Proceedings of the 4th International Conference on Discovery Science

Extraction of the contents in the web texts by content-density distribution
International Journal of Knowledge Engineering and Soft Data Paradigms

Extraction of web texts using content-density distribution
AIRS'11 Proceedings of the 7th Asia conference on Information Retrieval Technology

Abstract

This paper presents a method for automatically generating hypertext links for electronic documents. The goal is to generate links from an arbitrary part of a document (the source of a link) to the relevant parts of target documents (the destinations). To achieve this, we assume that parts of documents that are relevant to each other often share words. To extract parts that densely contain the words of a source (keywords), we employ density distributions of keywords. This lets us determine destinations simply by extracting the parts whose density exceeds a threshold. Experiments on generating links from figures/tables to parts of documents, as well as from text to parts of different documents, show that our method with optimal parameters yields a recall of 60% and a precision of 50%.
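
The thresholding idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the density distribution is computed, so the Hann-shaped smoothing window, the window width, the threshold values, and the function names `density_distribution` and `extract_destinations` below are all assumptions made for illustration, not the authors' implementation.

```python
import math


def density_distribution(tokens, keywords, window=5):
    """Smooth keyword occurrences over token positions.

    Assumption: a Hann-shaped window is used for smoothing; the abstract
    does not state which window the authors use.
    """
    hits = [1.0 if t in keywords else 0.0 for t in tokens]
    half = window // 2
    offsets = range(-half, half + 1)
    # Hann-shaped weights: 1.0 at the centre, smaller towards the edges.
    weights = [0.5 * (1 + math.cos(math.pi * k / (half + 1))) for k in offsets]
    density = []
    for i in range(len(tokens)):
        total = 0.0
        for k, w in zip(offsets, weights):
            j = i + k
            if 0 <= j < len(tokens):
                total += w * hits[j]
        density.append(total)
    return density


def extract_destinations(density, threshold):
    """Return (start, end) token spans, end exclusive, with density > threshold."""
    spans = []
    start = None
    for i, d in enumerate(density):
        if d > threshold and start is None:
            start = i
        elif d <= threshold and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(density)))
    return spans


if __name__ == "__main__":
    # Toy target text: "k" stands for any keyword taken from the link source.
    tokens = ["a", "b", "c", "k", "k", "k", "d", "e", "f", "g"]
    density = density_distribution(tokens, {"k"}, window=3)
    print(extract_destinations(density, threshold=1.0))  # prints [(3, 6)]
```

In this sketch, lowering the threshold widens the extracted span and raising it narrows the span, which is one way to picture the trade-off between recall and precision reported in the experiments.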