Enhancing Speedup in Network Processing Applications by Exploiting Instruction Reuse with Flow Aggregation

Authors:
G. Surendra; Subhasis Banerjee; S. K. Nandy
Affiliations:
Indian Institute of Science; Indian Institute of Science; Indian Institute of Science
Venue:
DATE '03 Proceedings of the conference on Design, Automation and Test in Europe - Volume 1
Year:
2003

Abstract

Instruction reuse is a microarchitectural technique that reduces the execution time of a program by removing redundant computations at run time. Although removing redundant computation is ordinarily the job of an optimizing compiler, compilers often fail to do so because they have limited knowledge of run-time data. In this paper we examine instruction reuse of integer ALU and load instructions in network processing applications. Specifically, this paper attempts to answer the following questions: (1) How much instruction reuse is inherent in network processing applications? (2) Can reuse be improved by reducing interference in the reuse buffer? (3) What characteristics of network applications can be exploited to improve reuse? (4) What is the effect of reuse on resource contention and memory accesses? We propose an aggregation scheme that combines a high-level concept of network traffic, i.e., "flows", with a low-level microarchitectural feature of programs, i.e., repetition of instructions and data, along with an architecture that exploits temporal locality in incoming packet data to improve reuse. We find that, for the benchmarks considered, 1% to 50% of instructions are reused, while the speedup achieved varies between 1% and 24%. As a side effect, instruction reuse reduces memory traffic and can therefore be considered a low-power technique.
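
The interaction between a reuse buffer and flows can be illustrated with a small software model. The sketch below is not the architecture evaluated in the paper; it is a minimal illustration under stated assumptions: the reuse buffer is direct-mapped on the instruction address and matches on operand values, only ALU-style instructions are modeled (load reuse would also need invalidation on stores), and "aggregation" is approximated by giving each flow class its own buffer partition. The names `ReuseBuffer`, `FlowPartitionedReuseBuffer`, and `run_demo`, as well as the buffer sizes and packet values, are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Entry:
    pc: int                      # address of the instruction that produced the result
    operands: Tuple[int, ...]    # operand values seen when the result was computed
    result: int


class ReuseBuffer:
    """Assumed model: direct-mapped buffer indexed by PC, matched on operand values."""

    def __init__(self, num_entries: int) -> None:
        self.entries: List[Optional[Entry]] = [None] * num_entries
        self.hits = 0
        self.lookups = 0

    def execute(self, pc: int, operands: Tuple[int, ...],
                compute: Callable[..., int]) -> int:
        self.lookups += 1
        index = pc % len(self.entries)
        entry = self.entries[index]
        if entry is not None and entry.pc == pc and entry.operands == operands:
            self.hits += 1       # same instruction, same inputs: skip execution
            return entry.result
        result = compute(*operands)
        self.entries[index] = Entry(pc, operands, result)  # overwrite on a miss
        return result


class FlowPartitionedReuseBuffer:
    """Assumed aggregation model: one buffer partition per flow class."""

    def __init__(self, num_partitions: int, entries_per_partition: int) -> None:
        self.partitions = [ReuseBuffer(entries_per_partition)
                           for _ in range(num_partitions)]

    def execute(self, flow_id: int, pc: int, operands: Tuple[int, ...],
                compute: Callable[..., int]) -> int:
        partition = self.partitions[flow_id % len(self.partitions)]
        return partition.execute(pc, operands, compute)

    @property
    def hits(self) -> int:
        return sum(p.hits for p in self.partitions)


def run_demo() -> Tuple[int, int]:
    """Interleave packets from two flows through one masking instruction."""
    mask_op = lambda addr, mask: addr & mask
    # (flow_id, destination address); the two flows alternate packet by packet.
    packets = [(0, 0x0A000001), (1, 0x0A000002)] * 4
    shared = ReuseBuffer(num_entries=4)
    split = FlowPartitionedReuseBuffer(num_partitions=2, entries_per_partition=4)
    for flow_id, dst in packets:
        shared.execute(0x40, (dst, 0xFFFFFF00), mask_op)
        split.execute(flow_id, 0x40, (dst, 0xFFFFFF00), mask_op)
    return shared.hits, split.hits
```

In this toy trace the two flows alternate, so in the shared buffer each packet overwrites the entry left by the other flow and the instruction is never reused, whereas with per-flow partitions every packet after the first of each flow hits. This is only meant to show why interference in the reuse buffer matters; the reuse rates and speedups reported in the abstract come from the paper's own benchmarks, not from this model.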