The Internet Archive is a live production system that stores close to a petabyte of data and delivers an average of 2.3 Gb/s to Internet users. We describe the architecture of this system, with an emphasis on its robustness and on how it is managed by a very small team of systems personnel. Notably, the current system does not employ a cache. We analyze the reasons for this decision and show that an effective cache could not have been built until now. However, new solid-state disk technology may make promising new cache implementations possible.