Using threads in interactive systems: a case study
SOSP '93 Proceedings of the fourteenth ACM symposium on Operating systems principles
The measured performance of personal computer operating systems
ACM Transactions on Computer Systems (TOCS) - Special issue on operating system principles
The design and implementation of the 4.4BSD operating system
The design and implementation of the 4.4BSD operating system
Using latency to evaluate interactive system performance
OSDI '96 Proceedings of the second USENIX symposium on Operating systems design and implementation
Execution characteristics of desktop applications on Windows NT
Proceedings of the 25th annual international symposium on Computer architecture
Piranha: a scalable architecture based on single-chip multiprocessing
Proceedings of the 27th annual international symposium on Computer architecture
Thread-level parallelism and interactive performance of desktop applications
ASPLOS IX Proceedings of the ninth international conference on Architectural support for programming languages and operating systems
IEEE Micro
Analysis of Personal Computer Workloads
MASCOTS '99 Proceedings of the 7th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems
Process Scheduling for the Parallel Desktop
ISPAN '05 Proceedings of the 8th International Symposium on Parallel Architectures,Algorithms and Networks
Dynamic Estimation of Task Level Parallelism with Operating System Support
ISPAN '05 Proceedings of the 8th International Symposium on Parallel Architectures,Algorithms and Networks
Request extraction in Magpie: events, schemas and temporal joins
Proceedings of the 11th workshop on ACM SIGOPS European workshop
HotPar'09 Proceedings of the First USENIX conference on Hot topics in parallelism
Using simple abstraction to reinvent computing for parallelism
Communications of the ACM
CReAMS: an embedded multiprocessor platform
ARC'11 Proceedings of the 7th international conference on Reconfigurable computing: architectures, tools and applications
A QHD-capable parallel H.264 decoder
Proceedings of the international conference on Supercomputing
A file is not a file: understanding the I/O behavior of Apple desktop applications
SOSP '11 Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles
On the impact of serializing contention management on STM performance
Journal of Parallel and Distributed Computing
Trends and challenges in operating systems---from parallel computing to cloud computing
Concurrency and Computation: Practice & Experience
A File Is Not a File: Understanding the I/O Behavior of Apple Desktop Applications
ACM Transactions on Computer Systems (TOCS)
FastLane: improving performance of software transactional memory for low thread counts
Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming
Bulk synchronous visualization
Proceedings of the 2013 International Workshop on Programming Models and Applications for Multicores and Manycores
Computational sprinting on a hardware/software testbed
Proceedings of the eighteenth international conference on Architectural support for programming languages and operating systems
Rapid, low-power loop execution in a network of functional units
Proceedings of the 17th Panhellenic Conference on Informatics
HCI'13 Proceedings of the 15th international conference on Human-Computer Interaction: human-centred design approaches, methods, tools, and environments - Volume Part I
The benefit of SMT in the multi-core era: flexibility towards degrees of thread-level parallelism
Proceedings of the 19th international conference on Architectural support for programming languages and operating systems
Virtual asymmetric multiprocessor for interactive performance of consolidated desktops
Proceedings of the 10th ACM SIGPLAN/SIGOPS international conference on Virtual execution environments
Hi-index | 0.02 |
As the effective limits of frequency and instruction level parallelism have been reached, the strategy of microprocessor vendors has changed to increase the number of processing cores on a single chip each generation. The implicit expectation is that software developers will write their applications with concurrency in mind to take advantage of this sudden change in direction. In this study we analyze whether software developers for laptop/desktop machines have followed the recent hardware trends by creating software for chip multi-processing. We conduct a study of a wide range of applications on Microsoft Windows 7 and Apple's OS X Snow Leopard, measuring Thread Level Parallelism on a high performance workstation and a low power desktop. In addition, we explore graphics processing units (GPUs) and their impact on chip multi-processing. We compare our findings to a study done 10 years ago which concluded that a second core was sufficient to improve system responsiveness. Our results on today's machines show that, 10 years later, surprisingly 2-3 cores are more than adequate for most applications and that the GPU often remains under-utilized. However, in some application specific domains an 8 core SMT system with a 240 core GPU can be effectively utilized. Overall these studies suggest that many-core architectures are not a natural fit for current desktop/laptop applications.