Implementation of the least-squares lattice with order and forgetting factor estimation for FPGA

  • Authors: Zdenek Pohl; Milan Tichy; Jiri Kadlec
  • Affiliation (all authors): Institute of Information Theory and Automation, Prague, Czech Republic
  • Venue: EURASIP Journal on Advances in Signal Processing
  • Year: 2008

Abstract

A high-performance RLS lattice filter that also estimates the unknown order and forgetting factor of the identified system was developed and implemented as a PCORE coprocessor for Xilinx EDK. Implemented in FPGA hardware, the coprocessor can fully exploit the parallelism in the algorithm and offload the microprocessor. The EDK integration allows effective programming and debugging of hardware-accelerated DSP applications. The RLS lattice core, extended by the order and forgetting factor estimation, was implemented using logarithmic number system (LNS) arithmetic. The RLS lattice is mapped onto the LNS arithmetic units using an optimal mapping found by cyclic scheduling. The resulting schedule allows four independent filters to run in parallel on one set of arithmetic macros. The coprocessor containing the RLS lattice core is highly configurable. It can exploit the modular structure of the RLS lattice filter to construct a pipelined serial connection of filters for even higher performance. It can also run independent parallel filters on the same input with different forgetting factors in order to estimate which order and exponential forgetting factor best describe the observed data. The FPGA coprocessor implementation presented in the paper is able to evaluate an RLS lattice filter of order 504 at a 12 kHz input data sampling rate. For filters of order up to 20, the probabilities of the order and forgetting factor hypotheses can be estimated continually. It has been demonstrated that the implemented coprocessor accelerates the MicroBlaze solution by up to 20 times. It has also been shown that the coprocessor performs up to 2.5 times faster than a highly optimized solution using a 50 MIPS SHARC DSP processor, while the MicroBlaze remains free to perform other tasks concurrently.
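
The abstract relies on logarithmic number system (LNS) arithmetic without showing what it looks like. The following is a minimal Python sketch of the general LNS idea only, not the paper's hardware design: a value is stored as a sign plus the base-2 logarithm of its magnitude, so multiplication and division reduce to adding or subtracting logarithms, while addition needs the correction term log2(1 ± 2^d). The names `LNS`, `encode`, `decode`, `lns_mul`, `lns_div`, and `lns_add` are hypothetical, and floating-point `math.log2` stands in for whatever fixed-point approximation of the correction term a hardware unit would use.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class LNS:
    """Hypothetical LNS value: sign flag plus log2 of the magnitude."""
    negative: bool
    log2_mag: float  # -inf encodes zero


ZERO = LNS(False, -math.inf)


def is_zero(v: LNS) -> bool:
    return v.log2_mag == -math.inf


def encode(x: float) -> LNS:
    """Convert an ordinary float to the sign/log representation."""
    if x == 0.0:
        return ZERO
    return LNS(x < 0.0, math.log2(abs(x)))


def decode(v: LNS) -> float:
    """Convert the sign/log representation back to a float."""
    if is_zero(v):
        return 0.0
    mag = 2.0 ** v.log2_mag
    return -mag if v.negative else mag


def lns_mul(a: LNS, b: LNS) -> LNS:
    """Multiplication: add the logarithms, XOR the signs."""
    if is_zero(a) or is_zero(b):
        return ZERO
    return LNS(a.negative != b.negative, a.log2_mag + b.log2_mag)


def lns_div(a: LNS, b: LNS) -> LNS:
    """Division: subtract the logarithms, XOR the signs."""
    if is_zero(b):
        raise ZeroDivisionError("LNS division by zero")
    if is_zero(a):
        return ZERO
    return LNS(a.negative != b.negative, a.log2_mag - b.log2_mag)


def lns_add(a: LNS, b: LNS) -> LNS:
    """Addition: larger log plus the correction term log2(1 +/- 2^d), d <= 0."""
    if is_zero(a):
        return b
    if is_zero(b):
        return a
    hi, lo = (a, b) if a.log2_mag >= b.log2_mag else (b, a)
    d = lo.log2_mag - hi.log2_mag  # difference of logs, never positive
    if a.negative == b.negative:
        return LNS(hi.negative, hi.log2_mag + math.log2(1.0 + 2.0 ** d))
    diff = 1.0 - 2.0 ** d
    if diff <= 0.0:  # magnitudes cancel exactly
        return ZERO
    return LNS(hi.negative, hi.log2_mag + math.log2(diff))
```

For example, `decode(lns_mul(encode(3.0), encode(-4.0)))` recovers -12 up to floating-point rounding. The sketch shows why LNS suits algorithms with many multiplications and divisions: those become plain additions, and only addition and subtraction need the nonlinear correction term.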