Memory access buffering in multiprocessors
ISCA '86 Proceedings of the 13th annual international symposium on Computer architecture
Transactional memory: architectural support for lock-free data structures
ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
Proceedings of the fourteenth annual ACM symposium on Principles of distributed computing
Synthesis of operation-centric hardware descriptions
Proceedings of the 2000 IEEE/ACM international conference on Computer-aided design
Modular scheduling of guarded atomic actions
Proceedings of the 41st annual Design Automation Conference
Hardware synthesis from guarded atomic actions with performance specifications
ICCAD '05 Proceedings of the 2005 IEEE/ACM International conference on Computer-aided design
Bulk Disambiguation of Speculative Threads in Multiprocessors
Proceedings of the 33rd annual international symposium on Computer Architecture
An effective hybrid transactional memory system with strong isolation guarantees
Proceedings of the 34th annual international symposium on Computer architecture
Scheduling as Rule Composition
MEMOCODE '07 Proceedings of the 5th IEEE/ACM International Conference on Formal Methods and Models for Codesign
Parallel operation in the control data 6600
AFIPS '64 (Fall, part II) Proceedings of the October 27-29, 1964, fall joint computer conference, part II: very high speed computer systems
Automatic generation of hardware/software interfaces
ASPLOS XVII Proceedings of the seventeenth international conference on Architectural Support for Programming Languages and Operating Systems
Leveraging rule-based designs for automatic power domain partitioning
Proceedings of the International Conference on Computer-Aided Design
Hi-index | 0.00 |
One solution to the timing closure problem is to perform infrequent operations in more than one cycle. Despite simplicity of the solution statement, it is not easily considered because it requires changes in RTL, which, in turn, exacerbates the verification problem. We offer a timing closure solution guaranteed to preserve functional correctness of designs expressed using atomic actions or rules. We exploit the fact that the semantics of atomic actions are untimed, that is, the time to execute an action is not specified. The current hardware synthesis technique from atomic actions assumes that each rule takes one clock cycle to complete its computation. Consequently, the rule with the longest combinational path determines the clock cycle of the entire design, often leading to needlessly slow circuits. We present a synthesis procedure for a system where the combinational circuits embodied in a rule can take multiple cycles without changing the semantics of the original design. We also present preliminary results based on an experimental compiler which uses the Bluespec (BSV) compiler front end and generates Verilog. The results show that the clock speed and the performance of circuits can be improved substantially by allowing slow paths to complete over multiple cycles. Our technique is orthogonal to solutions based on multiple clock domains.