ACM Computing Surveys (CSUR)
ISLPED '01 Proceedings of the 2001 international symposium on Low power electronics and design
Loop fusion for clustered VLIW architectures
Proceedings of the joint conference on Languages, compilers and tools for embedded systems: software and compilers for embedded systems
Design Methodology for a Tightly Coupled VLIW/Reconfigurable Matrix Architecture: A Case Study
Proceedings of the conference on Design, automation and test in Europe - Volume 2
Loop scheduling with timing and switching-activity minimization for VLIW DSP
ACM Transactions on Design Automation of Electronic Systems (TODAES)
SODA: A Low-power Architecture For Software Radio
Proceedings of the 33rd annual international symposium on Computer Architecture
EURASIP Journal on Applied Signal Processing
Relaxed K-best MIMO signal detector design and VLSI implementation
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Algorithm and implementation of the K-best sphere decoding for MIMO detection
IEEE Journal on Selected Areas in Communications
Finite precision processing in wireless applications
Proceedings of the Conference on Design, Automation and Test in Europe
Proceedings of the Conference on Design, Automation and Test in Europe
VLSI implementation of a fixed-complexity soft-output MIMO detector for high-speed wireless
EURASIP Journal on Wireless Communications and Networking
Energy Aware Signal Processing for Software Defined Radio Baseband Implementation
Journal of Signal Processing Systems
Implementation of a High-Speed MIMO Soft-Output Symbol Detector for Software Defined Radio
Journal of Signal Processing Systems
Exploration of Soft-Output MIMO Detector Implementations on Massive Parallel Processors
Journal of Signal Processing Systems
Fast performance evaluation of fixed-point systems with un-smooth operators
Proceedings of the International Conference on Computer-Aided Design
Microprocessors & Microsystems
Hi-index | 0.00 |
ML and near-ML MIMO detectors have attracted a lot of interest in recent years. However, almost all the reported implementations are delivered in ASICs or FPGAs. Our contribution is optimizing the near-ML MIMO detector for parallel programmable architectures, such as those with ILP and DLP features. In the proposed SSFE (Selective Spanning with Fast Enumeration), architecture-friendliness is explicitly introduced from the very beginning of the design flow. Importantly, high level algorithmic transformations make the dataflow pattern and structure fit architecture-characteristics very well. We enable abundant vector-parallelism with highly regular and deterministic dataflow in the SSFE; memory rearrangements, shuffling and non-predictable dynamism are all elaborately excluded. Hence, the SSFE can be easily parallelized and efficiently mapped onto ILP and DLP architectures. Furthermore, to fine-tune the SSFE on parallel architectures, extensive pre-compiler transformations are applied with the help of the application-level information. These optimize not only computation-operations but also address-generations and memory-accesses. Experiments show that the SSFE brings very efficient resource-utilizations on real-life VLIW architectures. Specifically, with the SSFE the percentage of NOPs instructions on VLIW is below 1%, even better than that achieved by the software-pipelined FFT. To the best of our knowledge, this is the first reported work about comprehensive optimizations of near-ML MIMO detectors for parallel programmable architectures.