The effectiveness of GlOSS for the text database discovery problem

  • Authors:
  • Luis Gravano; Héctor García-Molina; Anthony Tomasic

  • Affiliations:
  • Stanford University, Computer Science Dept., Margaret Jacks Hall, Stanford, CA; Stanford University, Computer Science Dept., Margaret Jacks Hall, Stanford, CA; Stanford University, Computer Science Dept., Margaret Jacks Hall, Stanford, CA and Princeton University, Department of Computer Science

  • Venue:
  • SIGMOD '94: Proceedings of the 1994 ACM SIGMOD international conference on Management of data
  • Year:
  • 1994

Abstract

The popularity of on-line document databases has led to a new problem: finding which text databases (out of many candidate choices) are the most relevant to a user. Identifying the relevant databases for a given query is the text database discovery problem. The first part of this paper presents a practical solution based on estimating the result size of a query at a given database. The method is termed GlOSS (Glossary of Servers Server). The second part of this paper evaluates the effectiveness of GlOSS based on a trace of real user queries. In addition, we analyze the storage cost of our approach.