Journal of Parallel and Distributed Computing
The storage and computational requirements of large-scale applications are increasing rapidly. Clusters appear to be the most attractive architecture for such applications because of their low cost and high scalability. At the same time, smart disk systems, with their large storage capacities and growing computational power, are becoming increasingly popular. In this work, we compare the performance of these two architectures with that of a single host-based system using representative queries from Decision Support System (DSS) databases. We show how to implement individual database operations on the smart disk system and how to optimize the execution of a whole query by bundling frequently occurring operations together and executing the bundle in a single invocation. Besides decreasing the overall execution time, operation bundling also offers an easy-to-program and easy-to-use interface for accessing the data on smart disks. We also present a protocol for minimizing the communication time in the smart disk-based system. To measure response times, we have developed DBsim, an accurate simulator that can simulate database operations for single host-based, cluster-based, and smart disk-based systems. Using this simulator, we illustrate that the smart disk architecture offers substantial benefits in terms of overall query execution times for the TPC-D benchmark suite. In particular, the average response time of the smart disk architecture for the representative queries from the TPC-D benchmark in our base configuration is 71% smaller than the response time on the single host-based system and 4: 2% smaller than the response time on the fastest cluster architecture. We also demonstrate the effectiveness of operation bundling.
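The idea of operation bundling can be illustrated with a small sketch. It is not the paper's implementation or the DBsim interface: the `SmartDisk` class, its `invoke` method, the two operations, and the sample rows are all hypothetical names chosen for illustration. The sketch only assumes that each invocation of a disk has a fixed cost, so that running a selection and an aggregation as one bundle needs fewer invocations than running them one at a time.

```python
class SmartDisk:
    """Hypothetical smart disk that counts how often the host invokes it."""

    def __init__(self, rows):
        self.rows = list(rows)
        self.invocations = 0

    def invoke(self, ops, data=None):
        """Run the given operations in order within a single invocation.

        If no input is passed, the operations start from the rows
        stored on this disk.
        """
        self.invocations += 1
        result = self.rows if data is None else data
        for op in ops:
            result = op(result)
        return result


def select_cheap(rows):
    # Selection: keep rows whose price is below an illustrative threshold.
    return [r for r in rows if r["price"] < 100]


def sum_quantity(rows):
    # Aggregation: total quantity over the rows it receives.
    return sum(r["qty"] for r in rows)


def run_unbundled(disks):
    # One invocation per operation; the intermediate result goes back
    # through the host between the two calls.
    total = 0
    for disk in disks:
        selected = disk.invoke([select_cheap])
        total += disk.invoke([sum_quantity], selected)
    return total


def run_bundled(disks):
    # Selection and aggregation are bundled into one invocation per disk;
    # only the partial sum leaves the disk.
    return sum(disk.invoke([select_cheap, sum_quantity]) for disk in disks)


def make_disks():
    # Made-up sample data, two disks with two rows each.
    return [
        SmartDisk([{"price": 50, "qty": 2}, {"price": 150, "qty": 7}]),
        SmartDisk([{"price": 80, "qty": 3}, {"price": 99, "qty": 1}]),
    ]
```

Both functions return the same total for the same data, but in this model the bundled variant invokes each disk once instead of twice. How much time that saves in practice depends on the invocation and communication costs, which this sketch does not model.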