Highly concurrent scalar processing
ISCA '86 Proceedings of the 13th annual international symposium on Computer architecture
HPS, a new microarchitecture: rationale and introduction
MICRO 18 Proceedings of the 18th annual workshop on Microprogramming
Hardware support for large atomic units in dynamically scheduled machines
MICRO 21 Proceedings of the 21st annual workshop on Microprogramming and microarchitecture
Available instruction-level parallelism for superscalar and superpipelined machines
ASPLOS III Proceedings of the third international conference on Architectural support for programming languages and operating systems
ISCA '89 Proceedings of the 16th annual international symposium on Computer architecture
ISCA '89 Proceedings of the 16th annual international symposium on Computer architecture
IEEE Transactions on Computers
Cache and memory hierarchy design: a performance-directed approach
Cache and memory hierarchy design: a performance-directed approach
Machine organization of the IBM RISC System/6000 processor
IBM Journal of Research and Development
IEEE Spectrum
Cache performance of the integer SPEC benchmarks on a RISC
ACM SIGARCH Computer Architecture News
Trace-driven simulations for a two-level cache design in open bus systems
ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
Boosting beyond static scheduling in a superscalar processor
ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
ACM Computing Surveys (CSUR)
Bibliography and reading on CPU cache memories and related topics
ACM SIGARCH Computer Architecture News
Using cache memory to reduce processor-memory traffic
ISCA '83 Proceedings of the 10th annual international symposium on Computer architecture
Decoupled access/execute computer architectures
ISCA '82 Proceedings of the 9th annual symposium on Computer Architecture
Lockup-free instruction fetch/prefetch cache organization
ISCA '81 Proceedings of the 8th annual symposium on Computer Architecture
MICRO 24 Proceedings of the 24th annual international symposium on Microarchitecture
Data access microarchitectures for superscalar processors with compiler-assisted data prefetching
MICRO 24 Proceedings of the 24th annual international symposium on Microarchitecture
Delayed consistency and its effects on the miss rate of parallel programs
Proceedings of the 1991 ACM/IEEE conference on Supercomputing
The expandable split window paradigm for exploiting fine-grain parallelism
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Reducing memory latency via non-blocking and prefetching caches
ASPLOS V Proceedings of the fifth international conference on Architectural support for programming languages and operating systems
Tradeoffs in processor/memory interfaces for superscalar processors
MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
Pseudo vector processor based on register-windowed superscalar pipeline
Proceedings of the 1992 ACM/IEEE conference on Supercomputing
Balanced scheduling: instruction scheduling when memory latency is uncertain
PLDI '93 Proceedings of the ACM SIGPLAN 1993 conference on Programming language design and implementation
Cache write policies and performance
ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
Limitations of cache prefetching on a bus-based multiprocessor
ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
A scalar architecture for pseudo vector processing based on slide-windowed registers
ICS '93 Proceedings of the 7th international conference on Supercomputing
Tradeoffs in two-level on-chip caching
ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
Exploring the design space for a shared-cache multiprocessor
ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
Complexity/performance tradeoffs with non-blocking loads
ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
Effective cache prefetching on bus-based multiprocessors
ACM Transactions on Computer Systems (TOCS)
Supercomputing '95 Proceedings of the 1995 ACM/IEEE conference on Supercomputing
Simultaneous multithreading: maximizing on-chip parallelism
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
An analytical model of high performance superscalar-based multiprocessors
PACT '95 Proceedings of the IFIP WG10.3 working conference on Parallel architectures and compilation techniques
Increasing cache bandwidth using multi-port caches for exploiting ILP in non-numerical code
PACT '95 Proceedings of the IFIP WG10.3 working conference on Parallel architectures and compilation techniques
SPAID: software prefetching in pointer- and call-intensive environments
Proceedings of the 28th annual international symposium on Microarchitecture
Cache miss heuristics and preloading techniques for general-purpose programs
Proceedings of the 28th annual international symposium on Microarchitecture
Memory bandwidth limitations of future microprocessors
ISCA '96 Proceedings of the 23rd annual international symposium on Computer architecture
Increasing cache port efficiency for dynamic superscalar microprocessors
ISCA '96 Proceedings of the 23rd annual international symposium on Computer architecture
High-bandwidth address translation for multiple-issue processors
ISCA '96 Proceedings of the 23rd annual international symposium on Computer architecture
ISCA '96 Proceedings of the 23rd annual international symposium on Computer architecture
The case for a single-chip multiprocessor
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Improving single-process performance with multithreaded processors
ICS '96 Proceedings of the 10th international conference on Supercomputing
Tango: a hardware-based data prefetching technique for superscalar processors
Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
Constructing instruction traces from cache-filtered address traces (CITCAT)
ACM SIGARCH Computer Architecture News
Two-ported cache alternatives for superscalar processors
MICRO 26 Proceedings of the 26th annual international symposium on Microarchitecture
A study on the number of memory ports in multiple instruction issue machines
MICRO 26 Proceedings of the 26th annual international symposium on Microarchitecture
Increasing memory bandwidth with wide buses: compiler, hardware and performance trade-offs
ICS '97 Proceedings of the 11th international conference on Supercomputing
Data caches for superscalar processors
ICS '97 Proceedings of the 11th international conference on Supercomputing
The design and analysis of a cache architecture for texture mapping
Proceedings of the 24th annual international symposium on Computer architecture
Designing high bandwidth on-chip caches
Proceedings of the 24th annual international symposium on Computer architecture
Run-time adaptive cache hierarchy management via reference analysis
Proceedings of the 24th annual international symposium on Computer architecture
On high-bandwidth data cache design for multi-issue processors
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Streamlining inter-operation memory communication via data dependence prediction
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Resource widening versus replication: limits and performance-cost trade-off
ICS '98 Proceedings of the 12th international conference on Supercomputing
Utilizing reuse information in data cache management
ICS '98 Proceedings of the 12th international conference on Supercomputing
Retrospective: instruction issue logic for high-performance, interruptable pipelined processors
25 years of the international symposia on Computer architecture (selected papers)
Simultaneous multithreading: maximizing on-chip parallelism
25 years of the international symposia on Computer architecture (selected papers)
Effects of Multithreading on Cache Performance
IEEE Transactions on Computers - Special issue on cache memory and related problems
Proceedings of the 1999 ACM symposium on Applied computing
Decoupling local variable accesses in a wide-issue superscalar processor
ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
The Superthreaded Processor Architecture
IEEE Transactions on Computers
Access region locality for high-bandwidth processor memory system design
Proceedings of the 32nd annual ACM/IEEE international symposium on Microarchitecture
Active Management of Data Caches by Exploiting Reuse Information
IEEE Transactions on Computers
IEEE Transactions on Computers
Hardware spatial forwarding for widely shared data
Proceedings of the 14th international conference on Supercomputing
High Bandwidth On-Chip Cache Design
IEEE Transactions on Computers
Dead-block prediction & dead-block correlating prefetchers
ISCA '01 Proceedings of the 28th annual international symposium on Computer architecture
A High-Bandwidth Memory Pipeline for Wide Issue Processors
IEEE Transactions on Computers
Improving Latency Tolerance of Multithreading through Decoupling
IEEE Transactions on Computers
Facilitating level three cache studies using set sampling
Proceedings of the 32nd conference on Winter simulation
Reducing the complexity of the register file in dynamic superscalar processors
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
An adaptive, non-uniform cache structure for wire-delay dominated on-chip caches
Proceedings of the 10th international conference on Architectural support for programming languages and operating systems
Optimizing a Superscalar Machine to Run Vector Code
IEEE Parallel & Distributed Technology: Systems & Technology
A Memory Controller for Improved Performance of Streamed Computations on Symmetric Multiprocessors
IPPS '96 Proceedings of the 10th International Parallel Processing Symposium
Microprocessors - 10 Years Back, 10 Years Ahead
Informatics - 10 Years Back, 10 Years Ahead
Access ordering and memory-conscious cache utilization
HPCA '95 Proceedings of the 1st IEEE Symposium on High-Performance Computer Architecture
HPCA '95 Proceedings of the 1st IEEE Symposium on High-Performance Computer Architecture
Program balance and its impact on high performance RISC architectures
HPCA '95 Proceedings of the 1st IEEE Symposium on High-Performance Computer Architecture
Just Say No: Benefits of Early Cache Miss Determination
HPCA '03 Proceedings of the 9th International Symposium on High-Performance Computer Architecture
The Inaccuracy of Trace-Driven Simulation Using Incomplete Multiprogramming Trace Data
MASCOTS '96 Proceedings of the 4th International Workshop on Modeling, Analysis, and Simulation of Computer and Telecommunications Systems
Balanced scheduling: instruction scheduling when memory latency is uncertain
ACM SIGPLAN Notices - Best of PLDI 1979-1999
Wire Delay is Not a Problem for SMT (In the Near Future)
Proceedings of the 31st annual international symposium on Computer architecture
Store Buffer Design in First-Level Multibanked Data Caches
Proceedings of the 32nd annual international symposium on Computer Architecture
A NUCA substrate for flexible CMP cache sharing
Proceedings of the 19th annual international conference on Supercomputing
Exploiting the replication cache to improve performance for multiple-issue microprocessors
ACM SIGARCH Computer Architecture News - Special issue: MEDEA 2004 workshop
Exploiting the replication cache to improve cache read bandwidth cost effectively
MEDEA '05 Proceedings of the 2005 workshop on MEmory performance: DEaling with Applications, systems and architecture
Scalable Cache Miss Handling for High Memory-Level Parallelism
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
I-cache multi-banking and vertical interleaving
Proceedings of the 17th ACM Great Lakes symposium on VLSI
ICPP '94 Proceedings of the 1994 International Conference on Parallel Processing - Volume 01
3D-Stacked Memory Architectures for Multi-core Processors
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
Parallel Memory Architecture for Application-Specific Instruction-Set Processors
Journal of Signal Processing Systems
Access region cache with register guided memory reference partitioning
Journal of Systems Architecture: the EUROMICRO Journal
A load-instruction unit for pipelined processors
IBM Journal of Research and Development
Parallel memory architecture for TTA processor
SAMOS'07 Proceedings of the 7th international conference on Embedded computer systems: architectures, modeling, and simulation
Characterization of Fixed and Reconfigurable Multi-Core Devices for Application Acceleration
ACM Transactions on Reconfigurable Technology and Systems (TRETS)
Dynamic partition of memory reference instructions – a register guided approach
Euro-Par'05 Proceedings of the 11th international Euro-Par conference on Parallel Processing
A high performance adaptive miss handling architecture for chip multiprocessors
Transactions on High-Performance Embedded Architectures and Compilers IV
FPGA based efficient on-chip memory for image processing algorithms
Microelectronics Journal
Virtually split cache: An efficient mechanism to distribute instructions and data
ACM Transactions on Architecture and Code Optimization (TACO)