Effective rank aggregation for metasearching

  • Authors:
  • Leonidas Akritidis;Dimitrios Katsaros;Panayiotis Bozanis

  • Affiliations:
  • Department of Computer and Communication Engineering, Univarsity of Thessaly, Glavani 37, Volos 38221, Greece;Department of Computer and Communication Engineering, Univarsity of Thessaly, Glavani 37, Volos 38221, Greece;Department of Computer and Communication Engineering, Univarsity of Thessaly, Glavani 37, Volos 38221, Greece

  • Venue:
  • Journal of Systems and Software
  • Year:
  • 2011

Quantified Score

Hi-index 0.00

Visualization

Abstract

Nowadays, mashup services and especially metasearch engines play an increasingly important role on the Web. Most of users use them directly or indirectly to access and aggregate information from more than one data sources. Similarly to the rest of the search systems, the effectiveness of a metasearch engine is mainly determined by the quality of the results it returns in response to user queries. Since these services do not maintain their own document index, they exploit multiple search engines using a rank aggregation method in order to classify the collected results. However, the rank aggregation methods which have been proposed until now, utilize a very limited set of parameters regarding these results, such as the total number of the exploited resources and the rankings they receive from each individual resource. In this paper we present QuadRank, a new rank aggregation method, which takes into consideration additional information regarding the query terms, the collected results and the data correlated to each of these results (title, textual snippet, URL, individual ranking and others). We have implemented and tested QuadRank in a real-world metasearch engine, QuadSearch, a system developed as a testbed for algorithms related to the wide problem of metasearching. The name QuadSearch is related to the current number of the exploited engines (four). We have exhaustively tested QuadRank for both effectiveness and efficiency in the real-world search environment of QuadSearch and also, using a task from the recent TREC-2009 conference. The results we present in our experiments reveal that in most cases QuadRank outperformed all component engines, another metasearch engine (Dogpile) and two successful rank aggregation methods, Borda Count and the Outranking Approach.