Journaling is a widely used technique to increase file system robustness against metadata and/or data corruption. While the overhead of journaling can be masked by the page cache for small-scale, local file systems, we found that Lustre's use of journaling for the object store significantly impacted the overall performance of our large-scale, center-wide parallel file system. By requiring that each write request wait for a journal transaction to commit, Lustre introduced serialization to the client request stream and imposed additional latency due to disk head movement (seeks) for each request. In this paper, we present the challenges we faced while deploying a very large scale production storage system. Our work provides a head-to-head comparison of two significantly different approaches to increasing the overall efficiency of the Lustre file system. First, we present a hardware solution using external journaling devices to eliminate the latencies incurred by the extra disk head seeks due to journaling. Second, we introduce a software-based optimization that removes the synchronous commit for each write request, side-stepping additional latency and amortizing the journal seeks across a much larger number of requests. Both solutions have been implemented and experimentally tested on our Spider storage system, a very large scale Lustre deployment. Our tests show that both methods considerably improve write performance, in some cases by up to 93%. Testing with a real-world scientific application showed a 37% decrease in the number of journal updates, each with an associated seek, which translated into an average I/O bandwidth improvement of 56.3%.