A multiple floating point coprocessor architecture

  • Authors:
  • Lawrence Rauchwerger; P. Michael Farmwald

  • Affiliations:
  • Center for Supercomputing Research and Development, University of Illinois at Urbana-Champaign, Urbana, IL; Center for Supercomputing Research and Development, University of Illinois at Urbana-Champaign, Urbana, IL

  • Venue:
  • MICRO 23: Proceedings of the 23rd Annual Workshop and Symposium on Microprogramming and Microarchitecture
  • Year:
  • 1990

Abstract

General-purpose microprocessor-based computers usually improve their arithmetic performance by using a floating point coprocessor. Because adding more coprocessors poses neither a technological nor a cost problem, we investigated a system based on a MIPS R2000 [2] and four floating point units. In this paper we show a block diagram of such an implementation and how two important scientific operations can be accelerated using a single unmodified data bus. A large percentage of engineering applications are solved with linear algebra methods such as the BLAS3 [4] algorithms; it is precisely for these primitives that the proposed architecture brings significant performance gains. The first operation described is matrix multiplication, with its algorithm, timing diagram, and some results. Next, a polynomial evaluation technique is examined. Finally, we show how the same ideas can be used with various other microprocessors.
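
The abstract does not spell out the polynomial evaluation technique, so the following is only an illustrative sketch of one standard way such work could be divided among four units. It assumes the polynomial is split into four interleaved sub-polynomials in x^4, each evaluated independently with Horner's rule, and then recombined. The names `horner`, `split_eval`, and `NUM_UNITS` are hypothetical and do not come from the paper; the code models the arithmetic only, not the bus or the coprocessor timing.

```python
from typing import Sequence

NUM_UNITS = 4  # assumption: one sub-polynomial per floating point unit


def horner(coeffs: Sequence[float], x: float) -> float:
    """Evaluate sum(coeffs[i] * x**i) with Horner's rule."""
    acc = 0.0
    for c in reversed(coeffs):
        acc = acc * x + c
    return acc


def split_eval(coeffs: Sequence[float], x: float, units: int = NUM_UNITS) -> float:
    """Evaluate the same polynomial as `units` independent sub-problems.

    Coefficient i is assigned to sub-polynomial i % units, so that
    p(x) = sum_k x**k * q_k(x**units). Each q_k could, in principle,
    be evaluated by a separate unit (hypothetical mapping).
    """
    y = x ** units
    # Independent partial evaluations, one per (hypothetical) unit.
    partials = [horner(coeffs[k::units], y) for k in range(units)]
    # Recombine: sum_k partials[k] * x**k, again via Horner's rule.
    return horner(partials, x)
```

With this decomposition the four partial evaluations share the single operand x^4 and have no data dependencies on one another, which is the property a multi-unit arrangement would need in order to overlap them.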