Future interactive entertainment applications will feature the physical simulation of thousands of interacting objects using explosions, breakable objects, and cloth effects. While these applications require a tremendous amount of performance to satisfy the minimum frame rate of 30 FPS, there is a dramatic amount of parallelism in future physics workloads. How will future physics architectures leverage parallelism to achieve the real-time constraint? We propose and characterize a set of forward-looking benchmarks to represent future physics load and explore the design space of future physics processors. In response to the demand of this workload, we demonstrate an architecture with a set of powerful cores and caches to provide performance for the serial and coarse-grain parallel components of physics simulation, along with a flexible set of simple cores to exploit fine-grain parallelism. Our architecture combines intelligent, application-aware L2 management with dynamic coupling/allocation of simple cores to complex cores. Furthermore, we perform sensitivity analysis on interconnect alternatives to determine how tightly to couple these cores.