Computer manufacturers today offer multicore chips with multithreading capabilities and a wide range of core counts. The server domain is an important market for these chips. Web servers are a widely used class of servers, both to provide access to files and as front ends to more complex services. In this paper we characterize the performance of the Apache web server on multicore chips, using SPECweb2005 as the URL request generator. This benchmark provides three workloads that represent different usage environments. We also compare the results against those obtained with Surge, a URL request generator for static web pages. We find that the L2 data miss rate per instruction is below 1.4%, that more than 60% of the misses are classified as cold or capacity misses, and that true sharing misses represent between 12% and 38% of all misses. Although the data miss rate is small, accesses to main memory account for up to 42% of the execution time. By contrast, true sharing misses, which can be up to 38% of all misses, account for a small fraction of the time because of the low latency of cache-to-cache transfers inside the chip.
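The last two observations can be illustrated with a simple additive stall model: the time attributed to a class of misses depends on its latency as well as its frequency. The sketch below is only an illustration. The function name, the base CPI, the per-instruction miss rates, and the latencies are all hypothetical values chosen for the example, not measurements from the paper, and the model assumes miss latencies are not overlapped with computation.

```python
def time_shares(base_cpi, miss_classes):
    """Return the fraction of execution time attributed to each miss class.

    base_cpi: cycles per instruction with no L2 misses (hypothetical).
    miss_classes: maps a class name to (misses per instruction, latency in cycles).
    Assumes stalls simply add to the base CPI (no overlap).
    """
    # Stall cycles per instruction contributed by each class of miss.
    stall = {name: rate * latency for name, (rate, latency) in miss_classes.items()}
    total_cpi = base_cpi + sum(stall.values())
    return {name: cycles / total_cpi for name, cycles in stall.items()}


# Hypothetical inputs: misses served by main memory are twice as frequent
# as on-chip cache-to-cache transfers, but ten times slower.
shares = time_shares(
    base_cpi=2.0,
    miss_classes={
        "main_memory": (0.008, 250),    # cold and capacity misses
        "cache_to_cache": (0.004, 25),  # true sharing misses
    },
)

for name, share in shares.items():
    print(f"{name}: {share:.1%} of execution time")
```

Under these assumed numbers, the cache-to-cache class accounts for a third of all misses yet contributes only a small share of the time, while the main memory class dominates.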