Top-K data source selection for keyword queries over multiple XML data sources
Journal of Information Science
Most existing approaches to XML keyword search focus on querying a single data source. However, searching hundreds or even thousands of (distributed) data sources by sequentially querying each one is extremely costly and can therefore be impractical. In this paper, we propose an approach for selecting the top-k data sources for a given query, in order to avoid the high cost of searching numerous, potentially irrelevant data sources. The proposed approach efficiently selects the top-k most relevant data sources without querying the data sources themselves. We propose a ranking function that measures the strength of correlation between keywords in a data source, and we summarize each data source as a keyword correlation graph (K-Graph). The top-k relevant data sources are then selected by estimating the relevance of the corresponding K-Graphs to the query. Experimental results show that the approach achieves good performance under a variety of experimental parameters.
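The selection pipeline outlined in the abstract (summarize each source offline, then rank the summaries against the query) can be illustrated with a minimal sketch. The abstract does not give the actual ranking function or K-Graph construction, so everything below is an assumption made for illustration: the function names `build_kgraph`, `estimate_relevance`, and `select_top_k` are hypothetical, each source is modeled as a list of keyword lists rather than XML trees, and the edge weight is a simple co-occurrence ratio standing in for the proposed correlation measure.

```python
import heapq
from collections import defaultdict
from itertools import combinations


def build_kgraph(documents):
    """Summarize one data source as a keyword correlation graph.

    `documents` is a list of keyword lists. The edge weight is the fraction
    of documents in which both keywords occur. This is an assumed stand-in,
    not the ranking function proposed in the paper.
    """
    counts = defaultdict(int)
    for doc in documents:
        # Sort so each unordered keyword pair has one canonical key.
        for pair in combinations(sorted(set(doc)), 2):
            counts[pair] += 1
    total = len(documents)
    return {pair: c / total for pair, c in counts.items()} if total else {}


def estimate_relevance(kgraph, query):
    """Score a summary by summing edge weights over all query keyword pairs."""
    pairs = combinations(sorted(set(query)), 2)
    return sum(kgraph.get(pair, 0.0) for pair in pairs)


def select_top_k(kgraphs, query, k):
    """Return the names of the k best-scoring sources.

    Only the summaries are consulted; the sources themselves are not queried.
    """
    return heapq.nlargest(
        k, kgraphs, key=lambda name: estimate_relevance(kgraphs[name], query)
    )


if __name__ == "__main__":
    # Toy data, invented for this example.
    sources = {
        "dblp": [["xml", "keyword", "search"], ["xml", "search"]],
        "sports": [["football", "score"], ["xml", "score"]],
        "books": [["xml", "keyword"], ["novel", "author"]],
    }
    kgraphs = {name: build_kgraph(docs) for name, docs in sources.items()}
    print(select_top_k(kgraphs, ["xml", "keyword", "search"], k=2))
```

In this sketch the summaries are built once per source, so query-time cost depends on the number of sources and query keywords rather than on the size of the underlying data, which is the property the abstract emphasizes.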