ATUM: a new technique for capturing address traces using microcode
ISCA '86 Proceedings of the 13th annual international symposium on Computer architecture
ISCA '88 Proceedings of the 15th Annual International Symposium on Computer architecture
Address Tracing for Parallel Machines
Computer - Special issue on experimental research in computer architecture
SIGMETRICS '91 Proceedings of the 1991 ACM SIGMETRICS conference on Measurement and modeling of computer systems
Contrasting characteristics and cache performance of technical and multi-user commercial workloads
ASPLOS VI Proceedings of the sixth international conference on Architectural support for programming languages and operating systems
Memory system characterization of commercial workloads
Proceedings of the 25th annual international symposium on Computer architecture
Performance characterization of a Quad Pentium Pro SMP using OLTP workloads
Proceedings of the 25th annual international symposium on Computer architecture
MemorIES3: a programmable, real-time hardware emulation tool for multiprocessor server design
ASPLOS IX Proceedings of the ninth international conference on Architectural support for programming languages and operating systems
Computer
System Optimization for OLTP Workloads
IEEE Micro
Performance Measurement Intrusion and Perturbation Analysis
IEEE Transactions on Parallel and Distributed Systems
Queue Locks on Cache Coherent Multiprocessors
Proceedings of the 8th International Symposium on Parallel Processing
Temporal-Based Procedure Reordering for Improved Instruction Cache Performance
HPCA '98 Proceedings of the 4th International Symposium on High-Performance Computer Architecture
System-oriented evaluation of i/o subsystem performance
System-oriented evaluation of i/o subsystem performance
Profile-directed restructuring of operating system code
IBM Systems Journal
A multithreaded PowerPC processor for commercial servers
IBM Journal of Research and Development
HPCA '03 Proceedings of the 9th International Symposium on High-Performance Computer Architecture
Comprehensive multiprocessor cache miss rate generation using multivariate models
ACM Transactions on Computer Systems (TOCS)
Interconnections in Multi-Core Architectures: Understanding Mechanisms, Overheads and Scaling
Proceedings of the 32nd annual international symposium on Computer Architecture
High-Performance Throughput Computing
IEEE Micro
Maximizing CMP Throughput with Mediocre Cores
Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques
Performance/Watt: the new server focus
ACM SIGARCH Computer Architecture News - Special issue: dasCMP'05
The RASE (Rapid, Accurate Simulation Environment) for chip multiprocessors
ACM SIGARCH Computer Architecture News - Special issue: dasCMP'05
A case study in top-down performance estimation for a large-scale parallel application
Proceedings of the eleventh ACM SIGPLAN symposium on Principles and practice of parallel programming
Communist, utilitarian, and capitalist cache policies on CMPs: caches as a shared resource
Proceedings of the 15th international conference on Parallel architectures and compilation techniques
Comprehensive multivariate extrapolation modeling of multiprocessor cache miss rates
ACM Transactions on Computer Systems (TOCS)
Determining output uncertainty of computer system models
Performance Evaluation
Performance modeling for early analysis of multi-core systems
CODES+ISSS '07 Proceedings of the 5th IEEE/ACM international conference on Hardware/software codesign and system synthesis
Efficient architectural design space exploration via predictive modeling
ACM Transactions on Architecture and Code Optimization (TACO)
The coming wave of multithreaded chip multiprocessors
International Journal of Parallel Programming
ACS'07 Proceedings of the 7th Conference on 7th WSEAS International Conference on Applied Computer Science - Volume 7
Exploring power management in multi-core systems
Proceedings of the 2008 Asia and South Pacific Design Automation Conference
PicoServer: Using 3D stacking technology to build energy efficient servers
ACM Journal on Emerging Technologies in Computing Systems (JETC)
PIRR: a methodology for distributed network management in mobile networks
WSEAS Transactions on Information Science and Applications
Defining relevant distances between server workloads
Performance Evaluation
ICOSSSE'08 Proceedings of the 7th WSEAS international conference on System science and simulation in engineering
Performance of large low-associativity caches
ACM SIGMETRICS Performance Evaluation Review
Software agents as a versatile simulation tool to model complex systems
WSEAS Transactions on Information Science and Applications
Hi-index | 0.00 |
This paper discusses a methodology for analyzing and optimizing the performance of commercial servers. Commercial server workloads are shown to have unique characteristics which expand the elements that must be optimized to achieve good performance and require a unique performance methodology. The steps in the process of server performance optimization are described and include the following: 1. Selection of representative commercial workloads and identification of key characteristics to be evaluated. 2. Collection of performance data. Various instrumentation techniques are discussed in light of the requirements placed by commercial server workloads on the instrumentation. 3. Creation of input data for performance models on the basis of measured workload information. This step in the methodology must overcome the operating environment differences between the instance of the measured system under test and the target system design to be modeled. 4. Creation of performance models. Two general types are described: high-level models and detailed cycle-accurate simulators. These types are applied to model the processor, memory, and I/O system. 5. System performance optimization. The tuning of the operating system and application software is described. Optimization of performance among commercial applications is not simply an exercise in using traces to maximize the processor MIPS. Equally significant are items such as the use of probabilities to reflect future workload characteristics, software tuning, cache miss rate optimization, memory management, and I/O performance. The paper presents techniques for evaluating the performance of each of these key contributors so as to optimize the overall performance and cost/performance of commercial servers.