Paged similarity queries

  • Authors:
  • Enzo Seraphim; Thatyana F. Piola Seraphim; Edmilson M. Moreira; Fábio C. M. Ricotta; Caetano Traina, Jr.

  • Affiliations:
  • Institute of Engineering of Systems and Technology Information, Federal University of Itajubá (MG), Brazil; Institute of Engineering of Systems and Technology Information, Federal University of Itajubá (MG), Brazil; Institute of Engineering of Systems and Technology Information, Federal University of Itajubá (MG), Brazil; Institute of Engineering of Systems and Technology Information, Federal University of Itajubá (MG), Brazil; Institute of Mathematics and Computing, University of São Paulo at São Carlos (SP), Brazil

  • Venue:
  • Information Sciences: an International Journal
  • Year:
  • 2011

Abstract

An important feature of a database management system (DBMS) is its client/server architecture, where managing shared memory among the clients and the server is always a tough issue. Similarity queries are especially sensitive to this kind of architecture, since their answer sizes vary widely. Usually, the answer to a similarity query is computed in full and sent in full to the user, who is often interested in only part of it, e.g. just the few elements closest to or farthest from the query reference. Compelling the DBMS to retrieve the full answer, only to have most of it ignored, wastes server processing power at the very least. Paging is a technique that splits the answer into several pages, delivered as the client requests them. Despite the success of paging in traditional queries, little work has been done to support it in similarity queries. In this work, we present a technique that not only provides paging in similarity range and k-nearest neighbor queries, but also supports them in two variations: the forward similarity query and the backward similarity query, which return elements increasingly farther from or increasingly closer to the query reference, respectively. The reported experiments show that, depending on the proportion of the full answer that is of interest, both variations answer queries much faster than the non-paged approach.
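
To make the idea of forward and backward paging concrete, the following is a minimal sketch in Python. It is not the technique from the paper: it assumes a small in-memory collection and a brute-force scan instead of a metric index, and the names `paged_similarity`, `radius`, `k`, `page_size`, and `backward` are hypothetical, chosen only for illustration. What it does show is the client-facing behavior: a heap lets the answer be ordered lazily, so only the pages the client actually requests are put in order.

```python
import heapq
from itertools import islice


def paged_similarity(data, query, dist, page_size,
                     radius=None, k=None, backward=False):
    """Yield pages (lists) of (distance, element) pairs.

    Illustrative brute-force stand-in, NOT the paper's index-based method.
    radius   -- keep only elements within this distance (range query)
    k        -- keep only the k nearest elements (k-NN query)
    backward -- False: increasingly farther; True: increasingly closer
    """
    # Score every element; the index i breaks ties without comparing elements.
    scored = [(dist(query, x), i, x) for i, x in enumerate(data)]
    if radius is not None:
        scored = [s for s in scored if s[0] <= radius]
    if k is not None:
        scored = heapq.nsmallest(k, scored)

    # A min-heap on the signed distance gives either traversal order.
    sign = -1 if backward else 1
    heap = [(sign * d, i, x) for d, i, x in scored]
    heapq.heapify(heap)

    def ordered():
        # Elements are popped only when a page is actually requested.
        while heap:
            d, _, x = heapq.heappop(heap)
            yield sign * d, x

    it = ordered()
    while True:
        page = list(islice(it, page_size))
        if not page:
            return
        yield page


def abs_diff(a, b):
    return abs(a - b)


values = [1, 5, 9, 12, 20]

# Forward 3-NN query around 10, two elements per page.
forward = paged_similarity(values, 10, abs_diff, page_size=2, k=3)
print(next(forward))   # [(1, 9), (2, 12)]
print(next(forward))   # [(5, 5)]

# Backward range query (radius 5): farthest qualifying elements first.
backward = paged_similarity(values, 10, abs_diff, page_size=2,
                            radius=5, backward=True)
print(next(backward))  # [(5, 5), (2, 12)]
```

Because the generator is lazy, a client that stops after the first page never pays for ordering the rest of the answer. This sketch still computes every distance up front, which is a cost that an index-based approach would aim to avoid.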