ASPLOS II Proceedings of the second international conference on Architectural support for programming languages and operating systems
A set of level 3 basic linear algebra subprograms
ACM Transactions on Mathematical Software (TOMS)
Code generation for streaming: an access/execute mechanism
ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems
Evaluation of the WM architecture
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Evaluating stream buffers as a secondary cache replacement
ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
Access ordering and effective memory bandwidth
Maximizing memory bandwidth for streamed computations
ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
PIPE: a VLSI decoupled architecture
ISCA '85 Proceedings of the 12th annual international symposium on Computer architecture
Access ordering and memory-conscious cache utilization
HPCA '95 Proceedings of the 1st IEEE Symposium on High-Performance Computer Architecture
A performance comparison of contemporary DRAM architectures
ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
Hardware-only stream prefetching and dynamic access ordering
Proceedings of the 14th international conference on Supercomputing
Algorithmic foundations for a parallel vector access memory system
Proceedings of the twelfth annual ACM symposium on Parallel algorithms and architectures
ISCA '01 Proceedings of the 28th annual international symposium on Computer architecture
High-Performance DRAMs in Workstation Environments
IEEE Transactions on Computers
Performance of the Complex Streamed Instruction Set on Image Processing Kernels
Euro-Par '01 Proceedings of the 7th International Euro-Par Conference Manchester on Parallel Processing
Efficient orchestration of sub-word parallelism in media processors
Proceedings of the sixteenth annual ACM symposium on Parallelism in algorithms and architectures
Efficient address remapping in distributed shared-memory systems
ACM Transactions on Architecture and Code Optimization (TACO)
ALP: Efficient support for all levels of parallelism for complex media applications
ACM Transactions on Architecture and Code Optimization (TACO)
Impulse: Memory system support for scientific applications
Scientific Programming
Scalable barrier synchronisation for large-scale shared-memory multiprocessors
International Journal of High Performance Computing and Networking