We present MOCHA, a new self-extensible database middleware system designed to interconnect distributed data sources. MOCHA is designed to scale to large environments and is based on the idea that some of the user-defined functionality in the system should be deployed by the middleware system itself. This is realized by shipping Java code that implements either advanced data types or tailored query operators to remote data sources and executing it there. Optimized query plans push the evaluation of powerful data-reducing operators to the data source sites while executing data-inflating operators near the client's site. The Volume Reduction Factor is a new and more explicit metric introduced in this paper to select the best site to execute query operators, and it is shown to be more accurate than the standard selectivity factor alone. MOCHA has been implemented in Java and runs on top of Informix and Oracle. We present the architecture of MOCHA, the ideas behind it, and a performance study using scientific data and queries. The results of this study demonstrate that MOCHA provides a more flexible, scalable and efficient framework for distributed query processing than existing middleware solutions.
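The placement rule described above can be illustrated with a small Java sketch. The abstract does not define the Volume Reduction Factor, so the sketch assumes it is the ratio of the data volume an operator would send onward to the data volume it reads; an operator with a ratio below 1 is treated as data-reducing and placed at the data source, and any other operator is placed near the client. The class name, method names, and byte counts are hypothetical and are not part of MOCHA's actual API.

```java
public class VrfPlacementSketch {

    /** The two candidate execution sites named in the abstract. */
    public enum Site { DATA_SOURCE, CLIENT }

    /**
     * Assumed definition: bytes produced by the operator divided by bytes
     * it reads. This ratio is an assumption of the sketch, not a formula
     * quoted from the abstract.
     */
    public static double volumeReductionFactor(long bytesRead, long bytesProduced) {
        if (bytesRead <= 0 || bytesProduced < 0) {
            throw new IllegalArgumentException("volumes must be positive");
        }
        return (double) bytesProduced / (double) bytesRead;
    }

    /**
     * Data-reducing operators (ratio below 1) run at the data source so
     * less data crosses the network; all others run near the client.
     * Sending the neutral case (ratio exactly 1) to the client is an
     * arbitrary choice made for this sketch.
     */
    public static Site choosePlacement(long bytesRead, long bytesProduced) {
        return volumeReductionFactor(bytesRead, bytesProduced) < 1.0
                ? Site.DATA_SOURCE
                : Site.CLIENT;
    }

    public static void main(String[] args) {
        // Hypothetical aggregate: reads 1 MB of raster data, emits 8 KB.
        System.out.println("aggregate -> " + choosePlacement(1_000_000L, 8_000L));
        // Hypothetical upsampling function: reads 1 MB, emits 4 MB.
        System.out.println("upsample  -> " + choosePlacement(1_000_000L, 4_000_000L));
    }
}
```

Under this assumed definition, an operator that keeps every row but maps each large value to a small one has a selectivity of 1 and still a ratio far below 1, which is one way to read the claim that the metric is more accurate than selectivity alone.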