Vector vs. superscalar and VLIW architectures for embedded multimedia benchmarks
Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
Exploring the VLSI Scalability of Stream Processors
HPCA '03 Proceedings of the 9th International Symposium on High-Performance Computer Architecture
Overcoming the limitations of conventional vector processors
Proceedings of the 30th annual international symposium on Computer architecture
Proceedings of the 30th annual international symposium on Computer architecture
Matrix bidiagonalization: implementation and evaluation on the Trident processor
Neural, Parallel & Scientific Computations
The Vector-Thread Architecture
Proceedings of the 31st annual international symposium on Computer architecture
Implementing virtual memory in a vector processor with software restart markers
Proceedings of the 20th annual international conference on Supercomputing
The design space of data-parallel memory systems
Proceedings of the 2006 ACM/IEEE conference on Supercomputing
ALP: Efficient support for all levels of parallelism for complex media applications
ACM Transactions on Architecture and Code Optimization (TACO)
A 64-bit stream processor architecture for scientific applications
Proceedings of the 34th annual international symposium on Computer architecture
Neural, Parallel & Scientific Computations
Sams: single-affiliation multiple-stride parallel memory scheme
Proceedings of the 2008 workshop on Memory access on future processors: a solved problem?
A highly efficient implementation of a backpropagation learning algorithm using matrix ISA
Journal of Parallel and Distributed Computing
VEGAS: soft vector processor with scratchpad memory
Proceedings of the 19th ACM/SIGDA international symposium on Field programmable gate arrays
Neural, Parallel & Scientific Computations
Accelerator compiler for the VENICE vector processor
Proceedings of the ACM/SIGDA international symposium on Field Programmable Gate Arrays
Mat-core: a decoupled matrix core extension for general-purpose processors
Neural, Parallel & Scientific Computations
Reducing memory access latency with asymmetric DRAM bank organizations
Proceedings of the 40th Annual International Symposium on Computer Architecture
Journal of Parallel and Distributed Computing
Portable, flexible, and scalable soft vector processors
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Hi-index | 0.00 |