The promises inherent in users coming together to form data-sharing network communities bring to the foreground new problems formulated over such dynamic, ever-growing computing, storage, and networking infrastructures. A key open challenge is to harness these highly distributed resources toward the development of an ultra-scalable, efficient search engine. From a technical viewpoint, any acceptable solution must fully exploit all available resources, which dictates the removal of any centralized points of control, since these can readily lead to performance bottlenecks and reliability/availability problems. Equally important, a highly distributed solution can also facilitate pluralism in informing users about internet content, which is crucial in order to preclude the formation of information-resource monopolies and the biased visibility of content from economically powerful sources. To meet these challenges, the work described here puts forward MINERVA∞, a novel search engine architecture designed for scalability and efficiency. MINERVA∞ encompasses a suite of novel algorithms, including algorithms for creating data networks of interest, placing data on network nodes, and load balancing, as well as top-k algorithms for retrieving data at query time and replication algorithms for expediting top-k query processing. We have implemented the proposed architecture and we report on our extensive experiments with real-world, web-crawled, and synthetic data and queries, showcasing the scalability and efficiency traits of MINERVA∞.