Microwave tomography for breast cancer detection on Cell broadband engine processors

Authors:
Meilian Xu;Parimala Thulasiraman;Sima Noghanian
Affiliations:
Department of Computer Science, University of Manitoba, Winnipeg, Manitoba, Canada;Department of Computer Science, University of Manitoba, Winnipeg, Manitoba, Canada;Department of Electrical Engineering, University of North Dakota, USA
Venue:
Journal of Parallel and Distributed Computing
Year:
2012

Abstract

Microwave tomography (MT) is a safe screening modality that can be used for breast cancer detection. The technique uses the contrast in dielectric properties between different breast tissues at microwave frequencies to detect abnormalities. Our proposed MT approach is an iterative process that involves two algorithms: Finite-Difference Time-Domain (FDTD) and a Genetic Algorithm (GA). It is a compute-intensive problem: (i) detecting small tumors can require a large number of iterations; (ii) accuracy requires many fine-grained computations and discretizations of the object under screening. In our earlier work, we developed a parallel algorithm for microwave tomography on CPU-based homogeneous, multi-core, distributed memory machines. The performance improvement was limited by communication and synchronization latencies inherent in the algorithm. In this paper, we exploit the parallelism of microwave tomography on the Cell BE processor. Since FDTD is a numerical technique with regular memory accesses, intensive floating-point operations and SIMD-type operations, the algorithm can be mapped efficiently onto the Cell processor, achieving a significant performance improvement. The initial implementation of FDTD on Cell BE with 8 SPEs is 2.9 times faster than an eight-node shared memory machine and 1.45 times faster than an eight-node distributed memory machine. In this work, we modify the FDTD algorithm to overlap computations with communications during asynchronous DMA transfers. The modified algorithm also orchestrates the computations to make full use of the data between DMA transfers, which increases the computation-to-communication ratio. We observe a 54% improvement on 8 SPEs (27.9% on 1 SPE) for the modified FDTD compared with our original FDTD algorithm on Cell BE. We further reduce the synchronization latency between GA and FDTD by using mechanisms such as double buffering.
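To illustrate why FDTD has the regular, neighbor-only memory accesses described above, here is a minimal one-dimensional sketch in Python. It is not the authors' implementation: the grid size, the normalized Courant number, the Gaussian source, and the function name `fdtd_1d` are all assumptions chosen for illustration, and the real problem is multi-dimensional with tissue-dependent dielectric properties.

```python
import math


def fdtd_1d(num_cells=200, num_steps=150, source_cell=100):
    """Run a 1D free-space FDTD (leapfrog) loop and return the final E field.

    All parameters are hypothetical illustration values, not taken from the paper.
    """
    ez = [0.0] * num_cells        # electric field at integer grid points
    hy = [0.0] * (num_cells - 1)  # magnetic field at half grid points
    courant = 0.5                 # normalized time step c*dt/dx (stable if <= 1)

    for step in range(num_steps):
        # Update H from the spatial difference of E: each cell reads only
        # its immediate neighbor, so accesses are sequential and regular.
        for k in range(num_cells - 1):
            hy[k] += courant * (ez[k + 1] - ez[k])
        # Update interior E from the spatial difference of H.
        # ez[0] and ez[-1] are never written, so they act as fixed boundaries.
        for k in range(1, num_cells - 1):
            ez[k] += courant * (hy[k] - hy[k - 1])
        # Soft source: add a Gaussian pulse in time at one cell.
        ez[source_cell] += math.exp(-(((step - 30) / 10.0) ** 2))
    return ez


if __name__ == "__main__":
    field = fdtd_1d()
    print(max(abs(v) for v in field))
```

Each inner loop applies the same multiply-add to contiguous elements, which is the kind of pattern that lends itself to SIMD execution and to streaming blocks of the grid through a small local store with DMA.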
We also propose a performance prediction model based on DMA transfers, the number of instructions and operations, the processor frequency and the DMA bandwidth. We show that the execution time predicted by the model agrees with the experimentally measured execution time on one SPE to within 0.5 s.
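The abstract does not give the model's formula, so the following Python sketch only shows one plausible shape for a prediction built from the same inputs (instruction count, processor frequency, DMA volume and DMA bandwidth). The function `predict_spe_time`, its parameters, the `overlap_fraction` term, and the numbers in the usage example are assumptions for illustration, not the authors' model or measurements.

```python
def predict_spe_time(num_instructions, cycles_per_instruction, clock_hz,
                     dma_bytes, dma_bandwidth_bytes_per_s,
                     overlap_fraction=0.0):
    """Estimate execution time in seconds on one SPE (hypothetical model).

    overlap_fraction is the share of the shorter phase (compute or DMA)
    that is hidden behind the other, e.g. through double buffering:
    0.0 means fully serialized, 1.0 means fully overlapped.
    """
    if not 0.0 <= overlap_fraction <= 1.0:
        raise ValueError("overlap_fraction must be between 0 and 1")
    compute_s = num_instructions * cycles_per_instruction / clock_hz
    transfer_s = dma_bytes / dma_bandwidth_bytes_per_s
    hidden_s = overlap_fraction * min(compute_s, transfer_s)
    return compute_s + transfer_s - hidden_s


if __name__ == "__main__":
    # Illustrative inputs only: 3.2e9 single-cycle instructions at 3.2 GHz
    # and 25.6e9 bytes moved at 25.6e9 bytes/s.
    serialized = predict_spe_time(3.2e9, 1.0, 3.2e9, 25.6e9, 25.6e9)
    overlapped = predict_spe_time(3.2e9, 1.0, 3.2e9, 25.6e9, 25.6e9,
                                  overlap_fraction=1.0)
    print(serialized, overlapped)
```

Under these assumed inputs, the gap between the serialized and overlapped estimates indicates how much time overlapping computation with asynchronous DMA transfers could recover in the best case.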