KMV-peer: a robust and adaptive peer-selection algorithm

Authors:
Yosi Mass;Yehoshua Sagiv;Michal Shmueli-Scheuer
Affiliations:
IBM Haifa Research Lab and The Hebrew University, Jerusalem, Haifa, Israel;The Hebrew University, Jerusalem, Jerusalem, Israel;IBM Haifa Research Lab, Haifa, Israel
Venue:
Proceedings of the fourth ACM international conference on Web search and data mining
Year:
2011


Abstract

The problem of fully decentralized search over many collections is considered. The objective is to approximate the results of centralized search (namely, using a central index) while controlling the communication cost and involving only a small number of collections. The proposed solution is couched in a peer-to-peer (P2P) network, but can also be applied in other setups. Peers publish per-term summaries of their collections. Specifically, for each term, the range of document scores is divided into intervals, and for each interval, a KMV (K Minimal Values) synopsis of its documents is created. A new peer-selection algorithm uses the KMV synopses and two scoring functions to adaptively rank the peers according to the relevance of their documents to a given query. The proposed method achieves high-quality results while meeting the above efficiency criteria. In particular, experiments were conducted on two large, real-world datasets: one of blogs and the other of web data. These experiments show that the algorithm outperforms state-of-the-art approaches and is robust across different collections, various scoring functions, and multi-term queries.
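
To make the per-term summaries described in the abstract more concrete, the following is a minimal Python sketch of a KMV synopsis built per score interval. It is an illustration under stated assumptions, not the authors' implementation: the function names (`unit_hash`, `kmv_synopsis`, `estimate_distinct`, `union_synopsis`, `per_interval_synopses`) are hypothetical, document scores are assumed to be normalized to [0, 1], the intervals are assumed to be of equal width, and the estimator (k - 1) / U_k, where U_k is the k-th smallest hash value, is a commonly used KMV estimator that the abstract itself does not specify. The peer-ranking step and the two scoring functions are not sketched, since the abstract gives no details about them.

```python
import hashlib
import heapq


def unit_hash(doc_id: str) -> float:
    """Map a document id to a pseudo-uniform value in [0, 1) (assumed hash choice)."""
    digest = hashlib.sha1(doc_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def kmv_synopsis(doc_ids, k):
    """Return the k smallest distinct hash values, in ascending order."""
    distinct_hashes = {unit_hash(d) for d in doc_ids}
    return heapq.nsmallest(k, distinct_hashes)


def estimate_distinct(synopsis, k):
    """Estimate the number of distinct documents behind a synopsis.

    With fewer than k values the synopsis holds every hash, so the count is
    exact; otherwise use (k - 1) / U_k, with U_k the largest retained value.
    """
    if len(synopsis) < k:
        return float(len(synopsis))
    return (k - 1) / synopsis[-1]


def union_synopsis(a, b, k):
    """Combine two synopses (same hash, same k) into a synopsis of the union."""
    return sorted(set(a) | set(b))[:k]


def per_interval_synopses(scored_docs, num_intervals, k):
    """Build one synopsis per score interval for a single term.

    scored_docs is an iterable of (doc_id, score) pairs with scores assumed
    to lie in [0, 1]; intervals are assumed to have equal width.
    """
    buckets = [[] for _ in range(num_intervals)]
    for doc_id, score in scored_docs:
        idx = min(int(score * num_intervals), num_intervals - 1)
        buckets[idx].append(doc_id)
    return [kmv_synopsis(bucket, k) for bucket in buckets]
```

A useful property in this setting is that synopses compose: merging the synopses published by two peers yields the same synopsis that would have been built over the union of their documents, so overlap between collections can be estimated without exchanging the documents themselves.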