SourceRank: relevance and trust assessment for deep web sources based on inter-source agreement

  • Authors:
  • Raju Balakrishnan;Subbarao Kambhampati

  • Affiliations:
  • Arizona State University, Tempe, AZ, USA;Arizona State University, Tempe, AZ, USA

  • Venue:
  • Proceedings of the 19th international conference on World wide web
  • Year:
  • 2010

Abstract

We consider the problem of deep web source selection and argue that existing source selection methods are inadequate as they are based on local similarity assessment. Specifically, they fail to account for the fact that sources can vary in trustworthiness and individual results can vary in importance. In response, we formulate a global measure to calculate relevance and trustworthiness of a source based on agreement between the answers provided by different sources. Agreement is modeled as a graph with sources at the vertices. On this agreement graph, source quality scores, namely SourceRank, are calculated as the stationary visit probability of a weighted random walk. Our experiments on online databases and 675 book sources from Google Base show that SourceRank improves relevance of the results by 25-40% over existing methods and Google Base ranking. SourceRank also reduces linearly with the corruption levels of the sources.
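
The idea of scoring sources by the stationary visit probability of a weighted random walk on an agreement graph can be illustrated with a short Python sketch. This is only an illustration under stated assumptions, not the authors' implementation: the function name `source_rank`, the `damping` parameter (a PageRank-style uniform jump added so the walk always converges), and the toy agreement weights are all hypothetical, and the abstract does not say how agreement between answers is measured.

```python
def source_rank(agreement, damping=0.85, tol=1e-10, max_iter=1000):
    """Stationary visit probabilities of a weighted random walk.

    agreement[i][j] is the (hypothetical) weight of the edge from source i
    to source j, i.e. how strongly j's answers are endorsed by i's answers.
    The damping factor is an assumption of this sketch, not from the abstract.
    """
    n = len(agreement)

    # Row-normalize edge weights into transition probabilities.
    # A source with no outgoing agreement jumps to any source uniformly.
    transition = []
    for row in agreement:
        total = sum(row)
        if total > 0:
            transition.append([w / total for w in row])
        else:
            transition.append([1.0 / n] * n)

    # Power iteration, starting from the uniform distribution.
    rank = [1.0 / n] * n
    for _ in range(max_iter):
        new_rank = [
            (1.0 - damping) / n
            + damping * sum(rank[i] * transition[i][j] for i in range(n))
            for j in range(n)
        ]
        delta = sum(abs(a - b) for a, b in zip(new_rank, rank))
        rank = new_rank
        if delta < tol:
            break
    return rank


# Toy example: sources 0 and 1 agree strongly with each other,
# while source 2 receives little agreement from either.
agreement = [
    [0.0, 0.9, 0.1],
    [0.9, 0.0, 0.1],
    [0.5, 0.5, 0.0],
]
scores = source_rank(agreement)
```

In this toy graph the two mutually agreeing sources end up with equal scores, and the source that few others agree with scores lower. The scores sum to 1, so they can be read as visit probabilities.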