Structured serial data is used in many scientific fields; such data sets consist of a series of records, and are typically written once, read many times, chronologically ordered, and read sequentially. In this paper we introduce DataSeries, an on-disk format, run-time library, and set of tools for storing and analyzing structured serial data. We identify six key properties of a system to store and analyze this type of data, and describe how DataSeries was designed to provide these properties. We quantify the benefits of DataSeries through several experiments. In particular, we demonstrate that DataSeries exceeds the performance of common trace formats by at least a factor of two.