U-Net: a user-level network interface for parallel and distributed computing
SOSP '95 Proceedings of the fifteenth ACM symposium on Operating systems principles
The impact of architectural trends on operating system performance
SOSP '95 Proceedings of the fifteenth ACM symposium on Operating systems principles
The HP AutoRAID hierarchical storage system
ACM Transactions on Computer Systems (TOCS) - Special issue on operating system principles
Design of the TruCluster multicomputer system for the Digital UNIX environment
Digital Technical Journal
File server scaling with network-attached secure disks
SIGMETRICS '97 Proceedings of the 1997 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
Performance characterization of a Quad Pentium Pro SMP using OLTP workloads
Proceedings of the 25th annual international symposium on Computer architecture
A cost-effective, high-bandwidth storage architecture
Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
UTLB: a mechanism for address translation on network interfaces
Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
Fast Messages: Efficient, Portable Communication for Workstation Clusters and MPPs
IEEE Parallel & Distributed Technology: Systems & Technology
DBMSs on a Modern Processor: Where Does Time Go?
VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
The Multi-Queue Replacement Algorithm for Second Level Buffer Caches
Proceedings of the General Track: 2002 USENIX Annual Technical Conference
Overview of memory channel network for PCI
COMPCON '96 Proceedings of the 41st IEEE International Computer Conference
User-Level Communication in Cluster-Based Servers
HPCA '02 Proceedings of the 8th International Symposium on High-Performance Computer Architecture
Cheating the I/O bottleneck: network storage with Trapeze/Myrinet
ATEC '98 Proceedings of the annual conference on USENIX Annual Technical Conference
Database research at the University of Illinois at Urbana-Champaign
ACM SIGMOD Record
miNI: reducing network interface memory requirements with dynamic handle lookup
ICS '03 Proceedings of the 17th annual international conference on Supercomputing
High performance RDMA-based MPI implementation over InfiniBand
ICS '03 Proceedings of the 17th annual international conference on Supercomputing
Dynamic Data Replication: An Approach to Providing Fault-Tolerant Shared Memory Clusters
HPCA '03 Proceedings of the 9th International Symposium on High-Performance Computer Architecture
Performance measurements of a user-space DAFS server with a database workload
NICELI '03 Proceedings of the ACM SIGCOMM workshop on Network-I/O convergence: experience, lessons, implications
Application performance on the Direct Access File System
WOSP '04 Proceedings of the 4th international workshop on Software and performance
Second-Level Buffer Cache Management
IEEE Transactions on Parallel and Distributed Systems
PB-LRU: a self-tuning power aware storage cache replacement algorithm for conserving disk energy
Proceedings of the 18th annual international conference on Supercomputing
Power-Aware Storage Cache Management
IEEE Transactions on Computers
PRESS: A Clustered Server Based on User-Level Communication
IEEE Transactions on Parallel and Distributed Systems
Empirical evaluation of multi-level buffer cache collaboration for storage systems
SIGMETRICS '05 Proceedings of the 2005 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
IEEE Transactions on Parallel and Distributed Systems
High performance support of parallel virtual file system (PVFS2) over Quadrics
Proceedings of the 19th annual international conference on Supercomputing
Hibernator: helping disk arrays sleep through the winter
Proceedings of the twentieth ACM symposium on Operating systems principles
High performance RDMA-based MPI implementation over infiniBand
International Journal of Parallel Programming - Special issue I: The 17th annual international conference on supercomputing (ICS'03)
Design Trade-Offs for User-Level I/O Architectures
IEEE Transactions on Computers
SNAPI '03 Proceedings of the international workshop on Storage network architecture and parallel I/Os
Efficient remote block-level I/O over an RDMA-capable NIC
Proceedings of the 20th annual international conference on Supercomputing
An SSL Back-End Forwarding Scheme in Cluster-Based Web Servers
IEEE Transactions on Parallel and Distributed Systems
Optimization and bottleneck analysis of network block I/O in commodity storage systems
Proceedings of the 21st annual international conference on Supercomputing
Benefits of high speed interconnects to cluster file systems: a case study with lustre
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Scalable memory registration for high performance networks using helper threads
Proceedings of the 8th ACM International Conference on Computing Frontiers
Providing safe, user space access to fast, solid state disks
ASPLOS XVII Proceedings of the seventeenth international conference on Architectural Support for Programming Languages and Operating Systems
RXIO: Design and implementation of high performance RDMA-capable GridFTP
Computers and Electrical Engineering
Hi-index | 0.01 |
This paper examines how VI-based interconnects can be used to improve I/O path performance between a database server and the storage subsystem. We design and implement a software layer, DSA, that is layered between the application and VI. DSA takes advantage of specific VI features and deals with many of its shortcomings. We provide and evaluate one kernel-level and two user-level implementations of DSA. These implementations trade transparency and generality for performance at different degrees, and unlike research prototypes are designed to be suitable for real-world deployment. We present detailed measurements using a commercial database management system with both micro-benchmarks and industrial database workloads on a mid-size, 4 CPU, and a large, 32 CPU, database server.Our results show that VI-based interconnects and user-level communication can improve all aspects of the I/O path between the database system and the storage back-end. We also find that to make effective use of VI in I/O intensive environments we need to provide substantial additional functionality than what is currently provided by VI. Finally, new storage APIs that help minimize kernel involvement in the I/O path are needed to fully exploit the benefits of VI-based communication.