A large part of the data on the World Wide Web resides in the deep web. Most deep web data sources support only simple text interfaces for querying, which are easy to use but have limited expressive power. As a result, processing complex structured queries over the deep web currently involves a large amount of manual work. Our work addresses the gap between users' need to express and execute complex structured queries over the deep web and the simple, limited input interfaces of existing deep web data sources. This paper presents a query planning problem formulation, with novel algorithms and optimizations, that enables a high-level and highly expressive query language to be supported over deep web data sources. We target three types of complex queries: select-project-join queries, aggregation queries, and nested queries. We have developed query planning algorithms that generate query plans for each of these types, and we propose several optimization techniques to further speed up query plan execution. Our experiments show that our algorithms scale well and that, for over 90% of the experimental queries, the execution time and result quality of the query plans they generate are very close to those of the optimal plans generated by an exhaustive search algorithm. In addition, our optimization techniques outperform an existing optimization method in both the reduction in transmitted data records and the query execution speedup.