As both the amount of data to be stored and the rate of data production grow, data center designers and operators face the challenge of planning and managing systems whose characteristics, workloads, and performance, availability, and reliability goals change rapidly. As we move towards software-defined data centers (SDDCs), the ability to reconfigure and adapt our solutions is increasing, but to take full advantage of that increase we must design more intelligent systems that are aware of how they are being used and able to deliver accurate predictions of their characteristics, workloads, and goals. In this paper we propose a novel algorithm for use in an intelligent, user-aware SDDC. The algorithm performs run-time analysis of user storage system activity with minimal impact on performance and provides accurate estimates of future user activity. Depending on the parameters used, it can produce both generalized and specific models. It is efficient and has low overhead, making it well suited to adding intelligence to SDDCs and building intelligent storage systems. We use our algorithm to analyze actual data from two real systems, monitoring user activity two and three times a day, respectively, over a period of roughly two years, for almost 500 distinct users.