Bit section instruction set extension of ARM for embedded applications

Authors:
Bengu Li;Rajiv Gupta
Affiliations:
The University of Arizona, Tucson, AZ;The University of Arizona, Tucson, AZ
Venue:
CASES '02 Proceedings of the 2002 international conference on Compilers, architecture, and synthesis for embedded systems
Year:
2002

Citing 13
Cited 8

Memory access coalescing: a technique for eliminating redundant memory accesses

PLDI '94 Proceedings of the ACM SIGPLAN 1994 conference on Programming language design and implementation
MediaBench: a tool for evaluating and synthesizing multimedia and communications systems

MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
The SimpleScalar tool set, version 2.0

ACM SIGARCH Computer Architecture News
Bitwidth analysis with application to silicon compilation

PLDI '00 Proceedings of the ACM SIGPLAN 2000 conference on Programming language design and implementation
Exploiting superword level parallelism with multimedia instruction sets

PLDI '00 Proceedings of the ACM SIGPLAN 2000 conference on Programming language design and implementation
C Compiler Design for an Industrial Network Processor

OM '01 Proceedings of the 2001 ACM SIGPLAN workshop on Optimization of middleware and distributed systems
ARM Architecture Reference Manual

ARM Architecture Reference Manual
ARM System Architecture

ARM System Architecture
NetBench: a benchmarking suite for network processors

Proceedings of the 2001 IEEE/ACM international conference on Computer-aided design
A Representation for Bit Section Based Analysis and Optimization

CC '02 Proceedings of the 11th International Conference on Compiler Construction
Data Compression Transformations for Dynamically Allocated Data Structures

CC '02 Proceedings of the 11th International Conference on Compiler Construction
Fast Subword Permutation Instructions Using Omega and Flip Network Stages

ICCD '00 Proceedings of the 2000 IEEE International Conference on Computer Design: VLSI in Computers & Processors
CommBench - a telecommunications benchmark for network processors

ISPASS '00 Proceedings of the 2000 IEEE International Symposium on Performance Analysis of Systems and Software

Bitwidth aware global register allocation

POPL '03 Proceedings of the 30th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Simple offset assignment in presence of subword data

Proceedings of the 2003 international conference on Compilers, architecture and synthesis for embedded systems
Bit level types for high level reasoning

Proceedings of the 14th ACM SIGSOFT international symposium on Foundations of software engineering
Runtime resource allocation in multi-core packet processing systems

HPSR'09 Proceedings of the 15th international conference on High Performance Switching and Routing
Optimal bitwise register allocation using integer linear programming

LCPC'06 Proceedings of the 19th international conference on Languages and compilers for parallel computing
Global productiveness propagation: a code optimization technique to speculatively prune useless narrow computations

Proceedings of the 2011 SIGPLAN/SIGBED conference on Languages, compilers and tools for embedded systems
Speculative subword register allocation in embedded processors

LCPC'04 Proceedings of the 17th international conference on Languages and Compilers for High Performance Computing
Enhanced bitwidth-aware register allocation

CC'06 Proceedings of the 15th international conference on Compiler Construction

Abstract

Programs that manipulate data at the subword level, i.e., bit sections within a word, are commonplace in the embedded domain. Examples include media processing and network processing codes. These applications spend significant amounts of time packing and unpacking narrow-width data into memory words. The execution time and memory overhead of packing and unpacking operations can be greatly reduced by providing direct instruction set support for manipulating bit sections.

In this paper we present the Bit Section eXtension (BSX) to the ARM instruction set. We selected the ARM processor for this research because it is one of the most popular embedded processors and is also used as the basis for many commercial network processing architectures. We present the design of BSX instructions and their encoding into the ARM instruction set. We have incorporated the implementation of BSX into the SimpleScalar ARM simulator from Michigan. Results of experiments with programs from various benchmark suites show that, by using BSX instructions, the total number of instructions executed at runtime by many transformed functions is reduced by 4.26% to 27.27%, and their code sizes are reduced by 1.27% to 21.05%.
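
To make the packing and unpacking overhead concrete, the following is a minimal C sketch of the conventional shift-and-mask code that reads and writes a bit section within a 32-bit word. It is not taken from the paper and does not show BSX instruction syntax; the helper names `low_mask`, `extract_bits`, and `insert_bits` are hypothetical, and the sketch assumes bit 0 is the least significant bit, `1 <= len <= 32`, and `pos + len <= 32`.

```c
#include <stdint.h>

/* Hypothetical helper: a mask with the low `len` bits set (1 <= len <= 32). */
static uint32_t low_mask(unsigned len)
{
    /* Shifting a 32-bit value by 32 is undefined in C, so handle it apart. */
    return (len >= 32) ? 0xFFFFFFFFu : ((1u << len) - 1u);
}

/* Unpack: read the `len`-bit section that starts at bit `pos` of `word`. */
static uint32_t extract_bits(uint32_t word, unsigned pos, unsigned len)
{
    return (word >> pos) & low_mask(len);
}

/* Pack: overwrite that section with the low `len` bits of `value`,
 * leaving every other bit of `word` unchanged. */
static uint32_t insert_bits(uint32_t word, unsigned pos, unsigned len,
                            uint32_t value)
{
    uint32_t mask = low_mask(len) << pos;
    return (word & ~mask) | ((value << pos) & mask);
}
```

For example, `extract_bits(0xABCD1234u, 8, 8)` yields `0x12`, and `insert_bits(0xABCD1234u, 8, 8, 0xFFu)` yields `0xABCDFF34`. Each access costs several shift and logical operations on a general-purpose instruction set; this is the kind of overhead that direct bit section support is meant to reduce.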