TRAPEDS: producing traces for multicomputers via execution driven simulation
SIGMETRICS '89 Proceedings of the 1989 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
Data prefetching in multiprocessor vector cache memories
ISCA '91 Proceedings of the 18th annual international symposium on Computer architecture
An effective on-chip preloading scheme to reduce data access penalty
Proceedings of the 1991 ACM/IEEE conference on Supercomputing
Prefetch unit for vector operations on scalar computers (abstract)
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
The performance impact of block sizes and fetch strategies
ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
Generation and analysis of very long address traces
ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
ACM Computing Surveys (CSUR)
ICS '93 Proceedings of the 7th international conference on Supercomputing
Effects of memory latencies on non-blocking processor/cache architectures
ICS '93 Proceedings of the 7th international conference on Supercomputing
Reducing cache conflicts in data cache prefetching
ACM SIGARCH Computer Architecture News - Special issue on input/output in parallel computer systems
Evaluating stream buffers as a secondary cache replacement
ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
A performance study of software and hardware data prefetching schemes
ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
ACM SIGARCH Computer Architecture News
Hardware implementation issues of data prefetching
ICS '95 Proceedings of the 9th international conference on Supercomputing
A data cache with multiple caching strategies tuned to different types of locality
ICS '95 Proceedings of the 9th international conference on Supercomputing
A limit study of local memory requirements using value reuse profiles
Proceedings of the 28th annual international symposium on Microarchitecture
An effective programmable prefetch engine for on-chip caches
Proceedings of the 28th annual international symposium on Microarchitecture
Evaluation of Hardware-Based Stride and Sequential Prefetching in Shared-Memory Multiprocessors
IEEE Transactions on Parallel and Distributed Systems
Memory bandwidth limitations of future microprocessors
ISCA '96 Proceedings of the 23rd annual international symposium on Computer architecture
Examination of a memory access classification scheme for pointer-intensive and numeric programs
ICS '96 Proceedings of the 10th international conference on Supercomputing
Tango: a hardware-based data prefetching technique for superscalar processors
Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
Improving data cache performance by pre-executing instructions under a cache miss
ICS '97 Proceedings of the 11th international conference on Supercomputing
Speculative execution via address prediction and data prefetching
ICS '97 Proceedings of the 11th international conference on Supercomputing
Memory-system design considerations for dynamically-scheduled processors
Proceedings of the 24th annual international symposium on Computer architecture
Run-time spatial locality detection and optimization
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Prediction caches for superscalar processors
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Characterization and improvement of load/store cache-based prefetching
ICS '98 Proceedings of the 12th international conference on Supercomputing
Using value prediction to increase the power of speculative execution hardware
ACM Transactions on Computer Systems (TOCS)
Investigating optimal local memory performance
Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
An Algorithm for Optimally Exploiting Spatial and Temporal Locality in Upper Memory Levels
IEEE Transactions on Computers - Special issue on cache memory and related problems
An Integrated Hardware/Software Data Prefetching Scheme for Shared-Memory Multiprocessors
International Journal of Parallel Programming
IEEE Transactions on Computers
Hardware spatial forwarding for widely shared data
Proceedings of the 14th international conference on Supercomputing
ACM Computing Surveys (CSUR)
Minimizing Average Schedule Length under Memory Constraints by Optimal Partitioning and Prefetching
Journal of VLSI Signal Processing Systems
ICS '01 Proceedings of the 15th international conference on Supercomputing
Going the distance for TLB prefetching: an application-driven study
ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
Sunder: a programmable hardware prefetch architecture for numerical loops
Proceedings of the 1994 ACM/IEEE conference on Supercomputing
Effective Hardware-Based Data Prefetching for High-Performance Processors
IEEE Transactions on Computers
Increasing hardware data prefetching performance using the second-level cache
Journal of Systems Architecture: the EUROMICRO Journal
Data remapping for design space optimization of embedded memory systems
ACM Transactions on Embedded Computing Systems (TECS)
Stride-directed Prefetching for Secondary Caches
ICPP '97 Proceedings of the international Conference on Parallel Processing
An adaptive sequential prefetching scheme in shared-memory multiprocessors
ICPP '97 Proceedings of the international Conference on Parallel Processing
A Hardware Scheme for Data Prefetching
HPCN Europe 2000 Proceedings of the 8th International Conference on High-Performance Computing and Networking
Caches with Compositional Performance
Embedded Processor Design Challenges: Systems, Architectures, Modeling, and Simulation - SAMOS
Speculative Prefetching of Induction Pointers
CC '01 Proceedings of the 10th International Conference on Compiler Construction
Caches with compositional performance
Embedded processor design challenges
Improving Performance for Software MPEG Players
COMPCON '96 Proceedings of the 41st IEEE International Computer Conference
Distributed Prefetch-buffer/Cache Design for High Performance Memory Systems
HPCA '96 Proceedings of the 2nd IEEE Symposium on High-Performance Computer Architecture
Approximating the optimal replacement algorithm
Proceedings of the 1st conference on Computing frontiers
ACM Transactions on Computer Systems (TOCS)
Effective stream-based and execution-based data prefetching
Proceedings of the 18th annual international conference on Supercomputing
Compiler orchestrated prefetching via speculation and predication
ASPLOS XI Proceedings of the 11th international conference on Architectural support for programming languages and operating systems
MicroLib: A Case for the Quantitative Comparison of Micro-Architecture Mechanisms
Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture
Tolerating memory latency through push prefetching for pointer-intensive applications
ACM Transactions on Architecture and Code Optimization (TACO)
On the performance of trace locality of reference
Performance Evaluation - Performance modelling and evaluation of high-performance parallel and distributed systems
On the importance of optimizing the configuration of stream prefetchers
Proceedings of the 2005 workshop on Memory system performance
Dynamic memory optimization using pool allocation and prefetching
ACM SIGARCH Computer Architecture News - Special issue on the 2005 workshop on binary instrumentation and application
Simple penalty-sensitive replacement policies for caches
Proceedings of the 3rd conference on Computing frontiers
Pattern-driven prefetching for multimedia applications on embedded processors
Journal of Systems Architecture: the EUROMICRO Journal
Efficient emulation of hardware prefetchers via event-driven helper threading
Proceedings of the 15th international conference on Parallel architectures and compilation techniques
Overlapping dependent loads with addressless preload
Proceedings of the 15th international conference on Parallel architectures and compilation techniques
A PAB-based multi-prefetcher mechanism
International Journal of Parallel Programming
Future execution: A prefetching mechanism that uses multiple cores to speed up single threads
ACM Transactions on Architecture and Code Optimization (TACO)
Reducing Cache Pollution via Dynamic Data Prefetch Filtering
IEEE Transactions on Computers
Analysis of hardware prefetching across virtual page boundaries
Proceedings of the 4th international conference on Computing frontiers
HAT-trie: a cache-conscious trie-based data structure for strings
ACSC '07 Proceedings of the thirtieth Australasian conference on Computer science - Volume 62
Heterogeneous associative cache for multimedia applications
IMSA'07 Proceedings of the IASTED European Conference on Internet and Multimedia Systems and Applications
Optimization of frequent itemset mining on multiple-core processor
VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Focused prefetching: performance oriented prefetching based on commit stalls
Proceedings of the 22nd annual international conference on Supercomputing
Automatic Prefetching with Binary Code Rewriting in Object-Based DSMs
Euro-Par '08 Proceedings of the 14th international Euro-Par conference on Parallel Processing
Stream chaining: exploiting multiple levels of correlation in data prefetching
Proceedings of the 36th annual international symposium on Computer architecture
Efficient Data Access Management for FPGA-Based Image Processing SoCs
RSP '09 Proceedings of the 2009 IEEE/IFIP International Symposium on Rapid System Prototyping
CASES '09 Proceedings of the 2009 international conference on Compilers, architecture, and synthesis for embedded systems
Heterogeneous associative cache for multimedia applications
EuroIMSA '07 Proceedings of the Third IASTED European Conference on Internet and Multimedia Systems and Applications
Engineering scalable, cache and space efficient tries for strings
The VLDB Journal — The International Journal on Very Large Data Bases
Improving cache locality for thread-level speculation
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Redesigning the string hash table, burst trie, and BST to exploit cache
Journal of Experimental Algorithmics (JEA)
Many-Thread Aware Prefetching Mechanisms for GPGPU Applications
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Extended histories: improving regularity and performance in correlation prefetchers
Proceedings of the 6th International Conference on High Performance and Embedded Architectures and Compilers
Resolving a L2-prefetch-caused parallel nonscaling on Intel Core microarchitecture
Journal of Parallel and Distributed Computing
Using runtime activity to dynamically filter out inefficient data prefetches
Euro-Par'11 Proceedings of the 17th international conference on Parallel processing - Volume Part I
Optimizing integrated application performance with cache-aware metascheduling
OTM'11 Proceedings of the 2011th Confederated international conference on On the move to meaningful internet systems - Volume Part II
A high performance heterogeneous architecture and its optimization design
HPCC'06 Proceedings of the Second international conference on High Performance Computing and Communications
Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture
Exploration of 3D grid caching strategies for ray-shooting
Journal of Real-Time Image Processing
ACSAC'05 Proceedings of the 10th Asia-Pacific conference on Advances in Computer Systems Architecture
Cache-Conscious collision resolution in string hash tables
SPIRE'05 Proceedings of the 12th international conference on String Processing and Information Retrieval
Unified memory optimizing architecture: memory subsystem control with a unified predictor
Proceedings of the 26th ACM international conference on Supercomputing
Boosting mobile GPU performance with a decoupled access/execute fragment processor
Proceedings of the 39th Annual International Symposium on Computer Architecture
Esodyp+: prefetching in the Jackal software DSM
Euro-Par'07 Proceedings of the 13th international Euro-Par conference on Parallel Processing
Exploiting single-usage for effective memory management
ACSAC'07 Proceedings of the 12th Asia-Pacific conference on Advances in Computer Systems Architecture
APOGEE: adaptive prefetching on GPUs for energy efficiency
PACT '13 Proceedings of the 22nd international conference on Parallel architectures and compilation techniques
Practical models for energy-efficient prefetching in mobile embedded systems
Microprocessors & Microsystems