Cache coherence protocols: evaluation using a multiprocessor simulation model
ACM Transactions on Computer Systems (TOCS)
A class of compatible cache consistency protocols and their support by the IEEE futurebus
ISCA '86 Proceedings of the 13th annual international symposium on Computer architecture
Memory-reference characteristics of multiprocessor applications under MACH
SIGMETRICS '88 Proceedings of the 1988 ACM SIGMETRICS conference on Measurement and modeling of computer systems
Address Tracing for Parallel Machines
Computer - Special issue on experimental research in computer architecture
Adaptive cache coherency for detecting migratory shared data
ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
An adaptive cache coherence protocol optimized for migratory sharing
ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
The accuracy of trace-driven simulations of multiprocessors
SIGMETRICS '93 Proceedings of the 1993 ACM SIGMETRICS conference on Measurement and modeling of computer systems
Contrasting characteristics and cache performance of technical and multi-user commercial workloads
ASPLOS VI Proceedings of the sixth international conference on Architectural support for programming languages and operating systems
Evaluating the performance of cache-affinity scheduling in shared-memory multiprocessors
Journal of Parallel and Distributed Computing
Memory system performance of UNIX on CC-NUMA multiprocessors
Proceedings of the 1995 ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems
An analysis of degenerate sharing and false coherence
Journal of Parallel and Distributed Computing
Trace-driven memory simulation: a survey
ACM Computing Surveys (CSUR)
Memory system characterization of commercial workloads
Proceedings of the 25th annual international symposium on Computer architecture
Performance characterization of a Quad Pentium Pro SMP using OLTP workloads
Proceedings of the 25th annual international symposium on Computer architecture
Computer architecture (2nd ed.): a quantitative approach
Computer architecture (2nd ed.): a quantitative approach
Pentium Pro and Pentium II system architecture (2nd ed.)
Pentium Pro and Pentium II system architecture (2nd ed.)
IEEE Transactions on Parallel and Distributed Systems
Infrastructure for electronic business on the Internet
Infrastructure for electronic business on the Internet
3 Tier Client/Server at Work
The Cache-Coherence Problem in Shared-Memory Multiprocessors: Hardware Solutions
The Cache-Coherence Problem in Shared-Memory Multiprocessors: Hardware Solutions
Trace Factory: Generating Workloads for Trace-Driven Simulation of Shared-Bus Multiprocessors
IEEE Parallel & Distributed Technology: Systems & Technology
Computer
False Sharing and Spatial Locality in Multiprocessor Caches
IEEE Transactions on Computers
Using Processor-Cache Affinity Information in Shared-Memory Multiprocessor Scheduling
IEEE Transactions on Parallel and Distributed Systems
Windows NT Clusters for Availability and Scalabilty
COMPCON '97 Proceedings of the 42nd IEEE International Computer Conference
The Memory Performance of DSS Commercial Workloads in Shared-Memory Multiprocessors
HPCA '97 Proceedings of the 3rd IEEE Symposium on High-Performance Computer Architecture
Detailed Characterization of a Quad Pentium Pro Server Running TPC-D
ICCD '99 Proceedings of the 1999 IEEE International Conference on Computer Design
An Architectural Evaluation of Java TPC-W
HPCA '01 Proceedings of the 7th International Symposium on High-Performance Computer Architecture
Hi-index | 0.00 |
The focus of this paper is on analyzing the effectiveness of SMP (Symmetric Multi-Processor) architecture for implementing Three-Tier Web-Servers. In particular, we considered a workload based on the TPC-W benchmark to evaluate the system.As the major bottleneck of this system is accessing memory through the shared bus, we analyzed what are the benefits of adopting several solutions aimed at boosting the global performance of the Web Server. Our aim is also to quantify the scalability of such a system and suggest solutions to achieve the desired processing power. The analysis starts from a reference case, and explores different architectural choices as for cache, scheduling algorithm, and coherence protocol in order to increase the number of processors possibly connected through the shared bus.Our results show that such an SMP based server could be scaled (up to 20 processor) quite above the limits expected for this kind of architecture, if particular attention is used in solving problems related to process migration and coherence overhead.