A case for redundant arrays of inexpensive disks (RAID)
SIGMOD '88 Proceedings of the 1988 ACM SIGMOD international conference on Management of data
Sequentiality and prefetching in database systems
ACM Transactions on Database Systems (TODS)
Towards a theory of cache-efficient algorithms
Journal of the ACM (JACM)
Understanding The Linux Kernel
Understanding The Linux Kernel
CFLRU: a replacement algorithm for flash memory
CASES '06 Proceedings of the 2006 international conference on Compilers, architecture and synthesis for embedded systems
CLOCK-Pro: an effective improvement of the CLOCK replacement
ATEC '05 Proceedings of the annual conference on USENIX Annual Technical Conference
Linear hashing: a new tool for file and table addressing
VLDB '80 Proceedings of the sixth international conference on Very Large Data Bases - Volume 6
Investigation of leading HPC I/O performance using a scientific-application derived benchmark
Proceedings of the 2007 ACM/IEEE conference on Supercomputing
AWOL: an adaptive write optimizations layer
FAST'08 Proceedings of the 6th USENIX Conference on File and Storage Technologies
Proceedings of the 2008 ACM/IEEE conference on Supercomputing
Factored operating systems (fos): the case for a scalable operating system for multicores
ACM SIGOPS Operating Systems Review
The multikernel: a new OS architecture for scalable multicore systems
Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles
PLFS: a checkpoint filesystem for parallel applications
Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis
Benchmarking cloud serving systems with YCSB
Proceedings of the 1st ACM symposium on Cloud computing
Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis
Multithreaded Asynchronous Graph Traversal for In-Memory and Semi-External Memory
Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis
An analysis of Linux scalability to many cores
OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
FIOS: a fair, efficient flash I/O scheduler
FAST'12 Proceedings of the 10th USENIX conference on File and Storage Technologies
A parallel page cache: IOPS and caching for multicore systems
HotStorage'12 Proceedings of the 4th USENIX conference on Hot Topics in Storage and File Systems
Hi-index | 0.00 |
We describe a storage system that removes I/O bottlenecks to achieve more than one million IOPS based on a userspace file abstraction for arrays of commodity SSDs. The file abstraction refactors I/O scheduling and placement for extreme parallelism and non-uniform memory and I/O. The system includes a set-associative, parallel page cache in the user space. We redesign page caching to eliminate CPU overhead and lock-contention in non-uniform memory architecture machines. We evaluate our design on a 32 core NUMA machine with four, eight-core processors. Experiments show that our design delivers 1.23 million 512-byte read IOPS. The page cache realizes the scalable IOPS of Linux asynchronous I/O (AIO) and increases user-perceived I/O performance linearly with cache hit rates. The parallel, set-associative cache matches the cache hit rates of the global Linux page cache under real workloads.