Collection selection has been a research issue for years. Related work typically uses precomputed statistics to estimate the expected result quality of each collection and then ranks the collections accordingly. Our thesis is that this simple approach is insufficient for several applications in which the collections typically overlap, as is the case for collections built by autonomous peers crawling the web. We argue for extending existing quality measures with estimators of mutual overlap among collections, and we present experiments in which this combination outperforms CORI, a popular approach based on quality estimation. We outline our prototype implementation of a P2P web search engine, called MINERVA, that handles large amounts of data in a distributed and self-organizing manner. Our experiments show that taking overlap into account during collection selection can drastically decrease the number of collections that must be contacted to reach a satisfactory level of recall, which is a significant step toward the feasibility of distributed web search.
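The idea of combining quality with overlap can be illustrated with a minimal sketch. It is not the paper's method: the function name `select_collections`, the use of exact document-id sets in place of compact overlap estimators, and the scoring rule (quality multiplied by the fraction of not-yet-covered documents) are all assumptions made for illustration.

```python
from typing import Dict, List, Set


def select_collections(
    quality: Dict[str, float],
    docs: Dict[str, Set[str]],
    k: int,
) -> List[str]:
    """Greedily pick up to k collections by quality * novelty.

    quality: hypothetical per-collection quality score for the query.
    docs:    hypothetical exact document-id sets; a real system would
             estimate overlap from compact statistics instead.
    """
    covered: Set[str] = set()      # documents already reachable
    remaining = set(quality)       # collections not yet chosen
    chosen: List[str] = []

    while remaining and len(chosen) < k:

        def score(name: str) -> float:
            size = len(docs[name])
            # Novelty: share of this collection not covered so far.
            novelty = len(docs[name] - covered) / size if size else 0.0
            return quality[name] * novelty

        # sorted() makes tie-breaking deterministic.
        best = max(sorted(remaining), key=score)
        if score(best) == 0.0:
            break                  # nothing new left to gain
        chosen.append(best)
        covered |= docs[best]
        remaining.remove(best)

    return chosen


# Toy data: A and B are both high quality but overlap heavily.
quality = {"A": 0.9, "B": 0.85, "C": 0.5}
docs = {
    "A": {"d1", "d2", "d3", "d4"},
    "B": {"d1", "d2", "d3", "d5"},
    "C": {"d6", "d7", "d8"},
}
print(select_collections(quality, docs, k=2))
```

On this toy data, a quality-only ranking would contact A and B, which share three of their four documents. The overlap-aware sketch picks A and then C, because B contributes only one new document once A is covered.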