Collection fusion using Bayesian estimation of a linear regression model in image databases on the Web

  • Authors:
  • Deok-Hwan Kim;Chin-Wan Chung

  • Affiliations:
  • Department of Information and Communication Engineering, Korea Advanced Institute of Science and Technology, 373-1, Kusong-dong, Yusong-gu, Taejon 305-701, South Korea;Department of Electrical Engineering and Computer Science, Korea Advanced Institute of Science and Technology, 373-1, Kusong-dong, Yusong-gu, Taejon 305-701, South Korea

  • Venue:
  • Information Processing and Management: an International Journal - Modelling vagueness and subjectivity in information access
  • Year:
  • 2003

Abstract

The collection fusion problem for image databases concerns retrieving relevant images, by content-based retrieval, from image databases distributed on the Web. While database selection and collection fusion have been studied extensively for text databases, little research has addressed image databases. Image databases on the Web are heterogeneous: they use different similarity measures and process queries according to their own policies. Our previous study [Inf. Process. Lett. 75 (1-2) (2000) 35] provided three algorithms for this problem. In this paper, the metaserver selects image databases whose similarity measures are correlated with a global similarity measure and then submits a query to them. We propose a new algorithm for this metaserver that exploits a probabilistic technique, Bayesian estimation of a linear regression model. It outperforms the previous approach across diverse result-set sizes, and its improvement in effectiveness is especially large for small result sets. We also provide a virtual optimal algorithm against which our algorithm is compared. Extensive experiments show the superiority of the Bayesian method over the others.
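
The abstract does not give the model's details, so the following is only a minimal sketch of how Bayesian estimation of a linear regression could support collection fusion. It assumes that, for each image database, the metaserver has sample pairs of (local similarity, global similarity), that the noise is Gaussian with known precision, and that the prior on (intercept, slope) is Gaussian. The names `fit_bayes_linear`, `fuse`, and `Posterior`, along with all parameter values, are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Posterior:
    """Posterior mean of the regression global ~ intercept + slope * local."""
    intercept: float
    slope: float


def fit_bayes_linear(local, global_, prior_mean=(0.0, 1.0),
                     prior_precision=1.0, noise_precision=25.0):
    """Posterior mean under a Gaussian prior and known Gaussian noise.

    Assumed model (not from the paper): prior N(prior_mean, I / prior_precision)
    on (intercept, slope), observations with precision noise_precision.
    """
    n = len(local)
    sx = sum(local)
    sxx = sum(x * x for x in local)
    sy = sum(global_)
    sxy = sum(x * y for x, y in zip(local, global_))
    # Posterior precision matrix A = alpha * I + beta * X^T X (2 x 2, symmetric).
    a11 = prior_precision + noise_precision * n
    a12 = noise_precision * sx
    a22 = prior_precision + noise_precision * sxx
    # Right-hand side b = alpha * m0 + beta * X^T y.
    b1 = prior_precision * prior_mean[0] + noise_precision * sy
    b2 = prior_precision * prior_mean[1] + noise_precision * sxy
    # Solve A m = b in closed form; det > 0 because prior_precision > 0.
    det = a11 * a22 - a12 * a12
    return Posterior(intercept=(a22 * b1 - a12 * b2) / det,
                     slope=(a11 * b2 - a12 * b1) / det)


def fuse(results, models, k):
    """Merge per-database hits by estimated global similarity, keep the top k.

    results: {db_name: [(image_id, local_score), ...]}
    models:  {db_name: Posterior}
    """
    merged = []
    for db, hits in results.items():
        m = models[db]
        for image_id, score in hits:
            merged.append((m.intercept + m.slope * score, db, image_id))
    merged.sort(key=lambda t: t[0], reverse=True)
    return merged[:k]
```

With no training pairs, the estimate falls back to the prior mean. This kind of fallback is one plausible reason a Bayesian estimate could help when little evidence is available, though the paper's own formulation may differ.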