The Internet provides a wealth of useful information in a vast number of dynamic information sources, but it is difficult to determine which sources are useful for a given query. Most existing techniques either require explicit source cooperation (for example, by exporting data summaries) or build a relatively static source characterization (for example, by assigning a topic to the source). We present a system, called InfoBeacons, that takes a different approach: data and sources are left "as is," and a peer-to-peer network of beacons uses past query results to "guide" queries to sources, which perform the actual query processing. This approach has several advantages: it requires minimal changes to sources, tolerates dynamism and heterogeneity, and scales to large numbers of sources. We present the architecture of the system and discuss the advantages of our design. We then focus on how a beacon can choose good sources for a query despite the loose coupling of beacons to sources. Beacons cache responses to previous queries and adapt the cache to changes at the source; the cache is then used to select good sources for future queries. We discuss results from a detailed experimental study using our beacon prototype, which demonstrates that our "loosely coupled" approach is effective: a beacon has to contact at most sixty percent of the sources contacted by existing, tightly coupled approaches, while providing results of equivalent or better relevance to queries.
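The cache-then-select idea described above can be illustrated with a minimal sketch. This is not the InfoBeacons implementation: the `Beacon` class, the `decay` parameter, and the term-count scoring are all assumptions introduced here for illustration, and the abstract does not specify the actual ranking function or cache-adaptation policy. The sketch only shows how a beacon might record terms from past results per source, age old evidence so the cache follows changes at the source, and rank sources for a new query from the cache alone.

```python
from collections import Counter, defaultdict


class Beacon:
    """Toy beacon (hypothetical): caches terms from past results per source."""

    def __init__(self, decay=0.5):
        # decay is an assumed parameter: older evidence is down-weighted each
        # time a source answers again, so the cache tracks source changes.
        self.decay = decay
        self.cache = defaultdict(Counter)  # source name -> term -> weight

    def record(self, source, result_texts):
        """Fold the result texts returned by `source` into its cache entry."""
        entry = self.cache[source]
        for term in list(entry):
            entry[term] *= self.decay
        for text in result_texts:
            for term in text.lower().split():
                entry[term] += 1.0

    def rank(self, query, k=2):
        """Return up to k sources whose cached terms best match the query."""
        terms = query.lower().split()
        scores = {
            source: sum(entry[t] for t in terms)
            for source, entry in self.cache.items()
        }
        # Highest score first; ties broken by source name for determinism.
        ranked = sorted(scores, key=lambda s: (-scores[s], s))
        return [s for s in ranked if scores[s] > 0][:k]


beacon = Beacon()
beacon.record("news", ["election results today", "storm warning issued"])
beacon.record("recipes", ["tomato soup recipe", "bread recipe"])
print(beacon.rank("soup recipe"))
```

In this sketch the sources themselves are never asked to export summaries; the beacon learns only from the responses it has already seen, which mirrors the loose coupling the abstract describes. A query with no cached evidence returns an empty list, in which case a real system would need some fallback, such as probing sources it knows little about.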