This paper introduces Bitwise, a compiler that minimizes the bitwidth (the number of bits used to represent each operand) for both integers and pointers in a program. By propagating static information both forward and backward in the program dataflow graph, Bitwise frees the programmer from declaring bitwidth invariants in cases where the compiler can determine bitwidths automatically. Because loop instructions comprise the bulk of dynamically executed instructions, Bitwise incorporates sophisticated loop analysis techniques for identifying bitwidths. We find a rich opportunity for bitwidth reduction in modern multimedia and streaming application workloads. For new architectures that support sub-word data types, we expect that our bitwidth reductions will save power and increase processor performance. This paper also applies our analysis to silicon compilation, the translation of programs into custom hardware, to realize the full benefits of bitwidth reduction. We describe our integration of Bitwise with the DeepC Silicon Compiler. By taking advantage of bitwidth information during architectural synthesis, we reduce silicon real estate by 15-86%, improve clock speed by 3-249%, and reduce power by 46-73%. The next era of general purpose and reconfigurable architectures should strive to capture a portion of these gains.
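
The idea of propagating bitwidth information forward and backward can be illustrated with a small sketch. This is not the Bitwise implementation: it is a minimal interval-based example written for this summary, and the names `Range`, `forward_add`, `forward_and_mask`, and `backward_demand` are hypothetical. Forward propagation bounds each result from the ranges of its inputs; backward propagation narrows an operand when later uses only need its low-order bits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    """Closed integer interval [lo, hi] that a value may take (illustrative type)."""
    lo: int
    hi: int

    def bits(self) -> int:
        """Bits needed to hold every value in the interval."""
        if self.lo >= 0:
            return max(1, self.hi.bit_length())  # unsigned representation
        # Two's complement: widest magnitude plus one sign bit.
        return max((-self.lo - 1).bit_length(), self.hi.bit_length()) + 1


def forward_add(a: Range, b: Range) -> Range:
    """Forward rule for a + b: add the interval endpoints."""
    return Range(a.lo + b.lo, a.hi + b.hi)


def forward_and_mask(a: Range, mask: int) -> Range:
    """Forward rule for a & mask, assuming mask is a non-negative constant."""
    hi = min(a.hi, mask) if a.lo >= 0 else mask
    return Range(0, hi)


def backward_demand(forward_bits: int, demanded_bits: int) -> int:
    """Backward rule: keep only as many low-order bits as later uses demand.

    Valid for operations such as addition, where the low k bits of the
    result depend only on the low k bits of the operands.
    """
    return min(forward_bits, demanded_bits)


# Program fragment being analyzed (assumed for illustration):
#   x = <value known to lie in 0..255>
#   y = x + x
#   z = y & 0xF
x = Range(0, 255)
y = forward_add(x, x)
z = forward_and_mask(y, 0xF)

# Forward pass alone.
print(x.bits(), y.bits(), z.bits())  # prints: 8 9 4

# Backward pass: only the low 4 bits of y are ever used, so y and x shrink too.
z_bits = z.bits()
y_bits = backward_demand(y.bits(), (0xF).bit_length())
x_bits = backward_demand(x.bits(), y_bits)
print(x_bits, y_bits, z_bits)  # prints: 4 4 4
```

In this fragment the forward pass alone assigns `y` nine bits, while the backward pass shows that four bits suffice for every value. A full analysis would also need rules for loops, pointers, and the remaining operators, which this sketch omits.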