The PA-8000 is Hewlett-Packard's first CPU to implement the new 64-bit PA-RISC 2.0 architecture. It combines a high clock frequency with a number of advanced microarchitectural features to deliver industry-leading performance on commercial and technical applications while maintaining full compatibility with all previous PA-RISC binaries. Among these features are a 56-entry instruction reorder buffer to support out-of-order execution; a branch target address cache; a branch history table; support for multiple outstanding cache misses; and dual integer, load/store, floating-point multiply/accumulate, and divide/square-root units, which allow execution of four instructions per cycle. Together, these features will enable the PA-8000 to sustain superscalar operation on a wide variety of workloads.