The Linked Data cloud comprises a great variety of data provided by a growing number of sources. Selecting the relevant sources is therefore a core ingredient of efficient query processing. So far, source selection has relied either on additional indexes or on iterative lookups of relevant URIs. None of the existing methods considers further aspects such as the degree of overlap between sources, which results in unnecessary requests. In this paper, we propose a sketch-based query routing strategy that takes source overlap into account. The strategy can be tuned either to retrieve as many results as possible within a given budget or to minimize the number of requests needed to retrieve all, or a given fraction, of the results. Our experiments show significant improvements over state-of-the-art source selection methods that ignore overlap.
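
To make the idea of overlap-aware routing under a request budget concrete, the following is a minimal sketch, not the method of the paper. The abstract does not say which sketch type or selection rule is used, so everything here is an assumption for illustration: each source is summarized by a k-minimum-values (KMV) sketch of its result identifiers, and sources are picked greedily by the estimated number of new results they would add. The names `make_sketch`, `merge`, `estimate`, `select_sources`, the sketch size `K`, and the example URIs are all hypothetical.

```python
import hashlib

K = 64                # assumed sketch size: keep the K smallest hash values
HASH_SPACE = 2 ** 64  # hashes are 64-bit integers


def h(item: str) -> int:
    """Map an item to a pseudo-uniform 64-bit integer."""
    return int.from_bytes(hashlib.sha1(item.encode("utf-8")).digest()[:8], "big")


def make_sketch(items, k=K):
    """KMV sketch: the k smallest distinct hash values of a source's items."""
    return sorted({h(x) for x in items})[:k]


def merge(a, b, k=K):
    """Sketch of the union of two sources, computed from their sketches alone."""
    return sorted(set(a) | set(b))[:k]


def estimate(sketch, k=K):
    """Estimate the number of distinct items summarized by a sketch."""
    if len(sketch) < k:
        return float(len(sketch))  # fewer than k hashes: the count is exact
    return (k - 1) * HASH_SPACE / sketch[-1]


def select_sources(sketches, budget, k=K):
    """Greedily pick up to `budget` sources by estimated marginal gain."""
    chosen, covered = [], []
    remaining = dict(sketches)
    while remaining and len(chosen) < budget:
        base = estimate(covered, k)
        gains = {name: estimate(merge(covered, s, k), k) - base
                 for name, s in remaining.items()}
        best = max(gains, key=gains.get)
        if gains[best] <= 0:
            break  # every remaining source only repeats known results
        chosen.append(best)
        covered = merge(covered, remaining.pop(best), k)
    return chosen


# Toy data: B is a subset of A, C is disjoint from both.
a_items = [f"http://example.org/r{i}" for i in range(40)]
sources = {
    "A": a_items,
    "B": a_items[:38],
    "C": [f"http://example.org/s{i}" for i in range(10)],
}
sketches = {name: make_sketch(items) for name, items in sources.items()}
chosen = select_sources(sketches, budget=2)
print(chosen)
```

In this toy setup an overlap-ignorant ranking by source size would spend the two requests on A and B, although B contributes nothing beyond A; the greedy loop instead skips B in favor of C. Stopping once a target fraction of the estimated total is covered, rather than at a fixed budget, would correspond to the second tuning goal mentioned in the abstract.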