MINERVA∞: a scalable efficient peer-to-peer search engine

Authors:
Sebastian Michel;Peter Triantafillou;Gerhard Weikum
Affiliations:
Max-Planck-Institut für Informatik, Saarbrücken, Germany;University of Patras, Greece;Max-Planck-Institut für Informatik, Saarbrücken, Germany
Venue:
Proceedings of the ACM/IFIP/USENIX 2005 International Conference on Middleware
Year:
2005


Abstract

The promises inherent in users coming together to form data-sharing network communities bring to the foreground new problems formulated over such dynamic, ever-growing computing, storage, and networking infrastructures. A key open challenge is to harness these highly distributed resources toward the development of an ultra-scalable, efficient search engine. From a technical viewpoint, any acceptable solution must fully exploit all available resources, which dictates the removal of any centralized points of control, since these can readily lead to performance bottlenecks and reliability/availability problems. Equally important, a highly distributed solution can also facilitate pluralism in informing users about internet content, which is crucial to preclude the formation of information-resource monopolies and the biased visibility of content from economically powerful sources. To meet these challenges, the work described here puts forward MINERVA∞, a novel search engine architecture designed for scalability and efficiency. MINERVA∞ encompasses a suite of novel algorithms, including algorithms for creating data networks of interest, placing data on network nodes, load balancing, top-k algorithms for retrieving data at query time, and replication algorithms for expediting top-k query processing. We have implemented the proposed architecture and report on our extensive experiments with real-world, web-crawled, and synthetic data and queries, showcasing the scalability and efficiency traits of MINERVA∞.