Distributed top-N query processing with possibly uncooperative local systems

  • Authors:
  • Clement Yu; George Philip; Weiyi Meng

  • Affiliations:
  • Dept. of Computer Science, U. of Illinois at Chicago, Chicago, IL; Dept. of Computer Science, U. of Illinois at Chicago, Chicago, IL; Dept. of Computer Science, SUNY at Binghamton, Binghamton, NY

  • Venue:
  • VLDB '03: Proceedings of the 29th International Conference on Very Large Data Bases - Volume 29
  • Year:
  • 2003

Abstract

We consider the problem of processing top-N queries in a distributed environment with possibly uncooperative local database systems. For a given top-N query, the problem is to efficiently find the N tuples that best satisfy the query, though not necessarily completely. Top-N queries are gaining popularity in relational databases and are expected to be very useful for e-commerce applications. Many companies offer the same types of goods and services to the public on the Web, and relational databases may be employed to manage the data. It is not feasible for a user to query a large number of databases individually. It is therefore desirable to provide a facility that accepts a user query at some site, retrieves suitable tuples from appropriate sites, merges the results, and presents them to the user. In this paper, we present a method for constructing such a facility. Our method consists of two steps. The first step determines which databases are likely to contain the desired tuples for a given query, so that the databases can be ranked by their desirability with respect to the query. Four different techniques are introduced for this step, one of which requires no cooperation from local systems. The second step determines how the ranked databases should be searched and which tuples from the searched databases should be returned. A new algorithm is proposed for this purpose. Experimental results comparing the different methods are presented, and very promising results are obtained using the method that requires no cooperation from local databases.
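The two-step structure described in the abstract (rank the databases, then search the ranked databases and merge the returned tuples) can be illustrated with a minimal Python sketch. This is not the paper's method: the abstract does not specify the four ranking techniques or the new search algorithm, so everything here is an assumption made for illustration. In particular, the distance function, the per-database summary point used for ranking, the `max_dbs` cutoff, and all function names are hypothetical.

```python
import heapq
from typing import Dict, List, Sequence, Tuple

Row = Tuple[float, ...]


def distance(row: Row, query: Row) -> float:
    """Assumed closeness measure: sum of absolute differences (smaller is better)."""
    return sum(abs(r - q) for r, q in zip(row, query))


def rank_databases(summaries: Dict[str, Row], query: Row) -> List[str]:
    """Step 1 (illustrative): order databases by how close a hypothetical
    per-database summary point lies to the query."""
    return sorted(summaries, key=lambda name: distance(summaries[name], query))


def top_n(
    databases: Dict[str, Sequence[Row]],
    summaries: Dict[str, Row],
    query: Row,
    n: int,
    max_dbs: int,
) -> List[Tuple[float, str, Row]]:
    """Step 2 (illustrative): search only the best-ranked databases, take each
    one's local top-n tuples, and merge them into a global top-n."""
    candidates: List[Tuple[float, str, Row]] = []
    for name in rank_databases(summaries, query)[:max_dbs]:
        # Each local system is asked only for its own n closest tuples.
        local_best = heapq.nsmallest(
            n, databases[name], key=lambda row: distance(row, query)
        )
        candidates.extend((distance(row, query), name, row) for row in local_best)
    # Merge: keep the n candidates with the smallest distance overall.
    return heapq.nsmallest(n, candidates)
```

Calling `top_n(databases, summaries, query, n=2, max_dbs=2)` with a dictionary of per-database tuple lists and a dictionary of summary points returns `(distance, database_name, tuple)` triples in order of increasing distance. A fixed `max_dbs` cutoff is the simplest possible stopping rule; deciding how many ranked databases to search and which tuples to return is exactly what the paper's second step addresses.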