Instruction overhead and data locality effects in superscalar processors

Authors:
M. Annavaram; G. S. Tyson; E. S. Davidson
Affiliations:
Dept. of Electr. Eng. & Comput. Sci., Michigan Univ., Ann Arbor, MI, USA
Venue:
ISPASS '00 Proceedings of the 2000 IEEE International Symposium on Performance Analysis of Systems and Software
Year:
2000

Abstract

To reduce software development and maintenance costs, programmers increasingly use object-oriented programming languages, such as C++, and rely on highly flexible data structures, such as linked lists. Object-oriented languages provide features that help manage complex software systems, but object-oriented programs tend to have higher instruction counts, e.g., due to generalized class implementations and many more calls to small functions. Linked data structures increase programming flexibility by allowing easy addition and deletion of nodes and by allocating memory dynamically for applications that use a large memory space. However, successive elements of a linked data structure may be allocated noncontiguously in memory, leading to poor spatial locality during list traversals, which in turn increases cache misses and reduces performance. This paper evaluates the impact of both the increased instruction overhead and the poor spatial locality on superscalar processor performance as issue width increases. We show that the underutilized resources of wide-issue processors can partially alleviate the impact of the instruction overhead. However, poor locality tends to cause more performance degradation as issue width increases. Finally, we show that the spatial locality of some programs can be improved by replacing linked list structures with a vector representation. Vectors exhibit better spatial locality during list traversals, but incur instruction overhead and memory copy overhead when nodes are added to or deleted from the structure.
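
The trade-off between the two representations can be illustrated with a minimal C++ sketch. It is not taken from the paper: the `Node` type and the function names are hypothetical, and `std::list` and `std::vector` stand in for whatever linked and vector structures the evaluated programs use. Traversal touches scattered heap nodes in the list case and one contiguous array in the vector case, while insertion in the middle costs a pointer walk for the list and an element shift (memory copy) for the vector.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <vector>

// Hypothetical node payload, assumed for illustration only.
struct Node {
    int key;
    int value;
};

// List traversal: each step follows a pointer to a separately allocated
// node, so consecutive elements may lie on different cache lines.
long sum_list(const std::list<Node>& nodes) {
    long total = 0;
    for (const Node& n : nodes) total += n.value;
    return total;
}

// Vector traversal: elements are stored contiguously, so several nodes
// share a cache line and the access pattern is sequential.
long sum_vector(const std::vector<Node>& nodes) {
    long total = 0;
    for (const Node& n : nodes) total += n.value;
    return total;
}

// List insertion before position pos: walk pos links, then relink in
// constant time without moving any existing element.
void insert_list(std::list<Node>& nodes, std::size_t pos, Node n) {
    assert(pos <= nodes.size());
    auto it = nodes.begin();
    std::advance(it, static_cast<std::ptrdiff_t>(pos));
    nodes.insert(it, n);
}

// Vector insertion before position pos: the slot is found in constant
// time, but every element from pos onward is shifted by one, which is
// the memory copy overhead mentioned in the abstract.
void insert_vector(std::vector<Node>& nodes, std::size_t pos, Node n) {
    assert(pos <= nodes.size());
    nodes.insert(nodes.begin() + static_cast<std::ptrdiff_t>(pos), n);
}
```

Both traversal functions execute the same source-level loop; any performance difference comes from memory layout rather than from the loop body. Whether the vector form is faster overall depends on how often a program traverses the structure relative to how often it inserts and deletes.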