Journal of Information Science
With the proliferation of XML data, keyword search over XML has attracted much attention. However, most current approaches focus on keyword search over a single XML document. In a system that integrates hundreds or even thousands of data sources, sequentially querying every source is extremely costly and may be impractical. In this article we propose a novel approach that selects the top-K data sources by their relevance to a given query, avoiding the high cost of searching numerous, potentially irrelevant sources. Our approach summarizes each data source as a succinct synopsis so that unpromising sources can be filtered out rapidly. We maintain both structural and value distribution information for each data source, and propose a novel ranking function that effectively measures the relevance of a data source to the given query. We conducted experiments with real datasets, and the results show that our approach achieves high performance on all evaluation metrics (recall, precision and Spearman's rank correlation coefficient) across different experimental parameters.
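As a rough illustration of the pipeline the abstract describes (score each source from its synopsis, keep the top K, then evaluate the selection), the sketch below may help. It is not the article's method: the synopsis here is assumed to be a plain keyword-to-count dictionary with no structural information, and `score` is a placeholder relevance function rather than the proposed ranking function. All names (`score`, `top_k_sources`, `precision_recall`, `spearman_rho`) are hypothetical.

```python
import heapq
import math


def score(synopsis, query):
    """Placeholder relevance (an assumption, not the article's ranking function):
    zero if any query keyword is absent, else the sum of log-scaled counts."""
    counts = [synopsis.get(kw, 0) for kw in query]
    if not counts or min(counts) == 0:
        return 0.0
    return sum(math.log1p(c) for c in counts)


def top_k_sources(synopses, query, k):
    """Return up to k source names, best first, skipping zero-score sources."""
    scored = ((score(syn, query), name) for name, syn in synopses.items())
    return [name for sc, name in heapq.nlargest(k, scored) if sc > 0]


def precision_recall(selected, relevant):
    """Precision and recall of the selected sources against a ground-truth set."""
    hits = len(set(selected) & set(relevant))
    precision = hits / len(selected) if selected else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def spearman_rho(ranking_a, ranking_b):
    """Spearman's rank correlation for two tie-free rankings of the same
    items (requires at least two items)."""
    n = len(ranking_a)
    pos_b = {item: i for i, item in enumerate(ranking_b)}
    d2 = sum((i - pos_b[item]) ** 2 for i, item in enumerate(ranking_a))
    return 1 - 6 * d2 / (n * (n * n - 1))


# Toy synopses for three hypothetical sources.
synopses = {
    "s1": {"xml": 10, "keyword": 5},
    "s2": {"xml": 2},
    "s3": {"xml": 1, "keyword": 1},
}
selected = top_k_sources(synopses, ["xml", "keyword"], k=2)
```

With these toy synopses, `s2` is filtered out because it lacks one of the query keywords, so only the synopses are consulted and the source itself is never queried. Comparing `selected` with the ranking obtained by actually querying every source is what recall, precision and Spearman's coefficient measure.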