Understanding the impact of X86/NT computing on microarchitecture

Authors:
Ravi Bhargava;Juan Rubio;Srikanth Kannan;Lizy K. John;David Christie;Leo Klaes
Affiliations:
Univ. of Texas at Austin, Austin;Univ. of Texas at Austin, Austin;Univ. of Texas at Austin, Austin;Univ. of Texas at Austin, Austin;Advanced Micro Devices;Advanced Micro Devices
Venue:
Workload characterization of emerging computer applications
Year:
2001

Abstract

Many performance evaluation studies in computer architecture rely almost exclusively on simulation of the dynamic instruction stream from a single application. The benchmarks used, such as the SPEC benchmarks, are often CPU intensive and rely very little on the operating system. However, a majority of computer systems are subjected to a different class of workloads, for which these common practices may not accurately reflect all performance issues. For example, operating system activity and context switches are ignored because many popular simulators and tracing techniques do not support the additional complexity. The main goal of this research is to understand the effects of operating system calls and context switches on the microarchitecture in a common computing environment. This work analyzes applications running in the ubiquitous Microsoft Windows environment on an x86 processor. Microarchitectural structures such as the instruction and data caches, TLB, and branch predictor are investigated in detail. The behavior of application and operating system code is studied to derive a complete picture of the execution behavior of these applications. In addition, a series of desktop and database applications are presented and compared with the SPEC CPU2000 suite. This analysis is conducted using a hardware tracer capable of tracing all activity, including operating system calls and context switches. We observe that the dynamic instruction stream of desktop and database applications contains 19% to 78% operating system activity, whereas SPEC CPU2000 applications typically involve less than 1% operating system activity. Not only are there more operating system calls, but the average number of instructions executed on each entry into the operating system is also higher for desktop and database applications. Data generated by the operating system and applications can interfere with each other. This results in more misses in the caches, more interference in the branch predictor, and worse TLB performance. We find that simulations with application code alone are not ideal for evaluating the performance of microarchitecture enhancements for many programs, especially database and desktop applications. Simulators and tracers capable of handling all system activity are essential for obtaining meaningful results for typical applications that interact with the operating system and for applications in multiprogrammed environments.
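
As an illustration only, the two trace-level metrics quoted in the abstract (the share of dynamic instructions executed in the operating system, and the average number of instructions per entry into the operating system) could be computed along the following lines. This is a minimal sketch, not the authors' tooling: the trace format (one boolean per dynamic instruction, True when the instruction ran in operating system code) and the function name `os_activity_stats` are assumptions made for this example.

```python
from typing import Iterable, Tuple


def os_activity_stats(trace: Iterable[bool]) -> Tuple[float, float]:
    """Summarize operating system activity in a dynamic instruction trace.

    `trace` is a hypothetical format: one flag per dynamic instruction,
    True if the instruction executed in operating system code.
    Returns (fraction of OS instructions, average instructions per OS entry).
    """
    total = 0
    os_instructions = 0
    os_entries = 0
    previous_in_os = False

    for in_os in trace:
        total += 1
        if in_os:
            os_instructions += 1
            # A transition from application code into OS code counts as one entry.
            if not previous_in_os:
                os_entries += 1
        previous_in_os = in_os

    fraction = os_instructions / total if total else 0.0
    per_entry = os_instructions / os_entries if os_entries else 0.0
    return fraction, per_entry


# Toy trace: 8 instructions, 4 of them in the OS, spread over 2 entries.
toy_trace = [False, False, True, True, True, False, True, False]
fraction, per_entry = os_activity_stats(toy_trace)
print(fraction, per_entry)  # prints "0.5 2.0"
```

Under this assumed format, a trace of application code alone would report a fraction of 0.0, which is the blind spot the abstract attributes to simulators that ignore operating system activity.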