Image Collector II: A System to Gather a Large Number of Images from the Web

  • Authors:
  • Keiji Yanai

  • Affiliations:
  • The author is with the Department of Computer Science, The University of Electro-Communications, Chofu-shi, 182-8585 Japan. E-mail: yanai@cs.uec.ac.jp

  • Venue:
  • IEICE - Transactions on Information and Systems
  • Year:
  • 2005

Abstract

We propose a system, called Image Collector II, that gathers hundreds of images from the World Wide Web related to a set of keywords provided by a user. The Image Collector, which we proposed previously, can gather only one or two hundred images. We propose the following two improvements to our previous system in terms of the number of gathered images and their precision: (1) In an initial image gathering, we extract words that appear with high frequency in all HTML files in which the output images are embedded; using these words as keywords, we then carry out a second image gathering. Through this process, we can obtain hundreds of images for one set of keywords. (2) The more images we gather, the lower the precision of the gathered images becomes. To improve the precision, we introduce word vectors of the HTML files that embed the images into the image selection process, in addition to image feature vectors.
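The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the system's implementation: the abstract does not say how words are tokenized, weighted, or compared, so the stopword list, the raw term counts, the cosine similarity measure, and the names `html_to_words`, `frequent_words`, and `word_vector_similarity` are all assumptions made for illustration.

```python
import math
import re
from collections import Counter
from html.parser import HTMLParser

# Assumed, deliberately tiny stopword list; a real system would need a fuller one.
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "on", "for", "with"}


class _TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML document (script/style text included)."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def html_to_words(html):
    """Return the lowercase alphabetic words of an HTML page, minus stopwords."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks).lower()
    return [w for w in re.findall(r"[a-z]+", text) if w not in STOPWORDS]


def frequent_words(html_pages, exclude=(), top_n=3):
    """Idea (1): pick the most frequent words across the pages that embed
    the initially gathered images, skipping the original query keywords."""
    counts = Counter()
    for page in html_pages:
        counts.update(html_to_words(page))
    for word in exclude:
        counts.pop(word, None)
    return [word for word, _ in counts.most_common(top_n)]


def word_vector_similarity(html_a, html_b):
    """Idea (2): compare two pages by the cosine of their word-count vectors."""
    a = Counter(html_to_words(html_a))
    b = Counter(html_to_words(html_b))
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    pages = [
        "<p>A lion in the savanna</p>",
        "<p>Lion cubs on the savanna grass</p>",
        "<p>lion</p>",
    ]
    # Candidate keywords for a second gathering, excluding the original query.
    print(frequent_words(pages, exclude={"lion"}, top_n=1))
    print(word_vector_similarity(pages[0], pages[1]))
```

In a pipeline like the one described, the words returned by `frequent_words` would be fed back to the search step as extra keywords, and a text similarity such as `word_vector_similarity` would be combined with an image-feature similarity when selecting images. How the two are combined is not stated in the abstract and is left open here.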