Measurements of a distributed file system
SOSP '91 Proceedings of the thirteenth ACM symposium on Operating systems principles
File archive activity in a supercomputing environment
ICS '93 Proceedings of the 7th international conference on Supercomputing
Efficient organization and access of multi-dimensional datasets on tertiary storage systems
Information Systems - Special issue: scientific databases
File system usage in Windows NT 4.0
Proceedings of the seventeenth ACM symposium on Operating systems principles
Massive arrays of idle disks for storage archives
Proceedings of the 2002 ACM/IEEE conference on Supercomputing
A comparison of file system workloads
ATEC '00 Proceedings of the annual conference on USENIX Annual Technical Conference
A five-year study of file-system metadata
FAST '07 Proceedings of the 5th USENIX conference on File and Storage Technologies
Analysis of Long Term File Reference Patterns for Application to File Migration Algorithms
IEEE Transactions on Software Engineering
Pergamum: replacing tape with energy efficient, reliable, disk-based archival storage
FAST'08 Proceedings of the 6th USENIX Conference on File and Storage Technologies
Measurement and analysis of large-scale network file system workloads
ATC'08 USENIX 2008 Annual Technical Conference on Annual Technical Conference
Capture, conversion, and analysis of an intense NFS workload
FAST '09 Proccedings of the 7th conference on File and storage technologies
Design implications for enterprise storage systems via multi-dimensional trace analysis
SOSP '11 Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles
The purge threat: scientists' thoughts on peta-scale usability
Proceedings of the sixth workshop on Parallel Data Storage
Analysis of Workload Behavior in Scientific and Historical Long-Term Data Repositories
ACM Transactions on Storage (TOS)
Evolutionary Trends in a Supercomputing Tertiary Storage Environment
MASCOTS '12 Proceedings of the 2012 IEEE 20th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems
Hi-index | 0.00 |
Archival storage systems for scientific data have been growing in both size and relevance over the past two decades, yet researchers and system designers alike must rely on limited and obsolete knowledge to guide archival management and design. To address this issue, we analyzed three years of file-level activities from the NCAR mass storage system, providing valuable insight into a large-scale scientific archive with over 1600 users, tens of millions of files, and petabytes of data. Our examination of system usage showed that, while a subset of users were responsible for most of the activity, this activity was widely distributed at the file level. We also show that the physical grouping of files and directories on media can improve archival storage system performance. Based on our observations, we provide suggestions and guidance for both future scientific archival system designs as well as improved tracing of archival activity.