Parallel implementation of a neural net training application in a heterogeneous grid environment

  • Authors:
  • Rafael Menéndez De Llano; José Luis Bosque

  • Affiliations:
  • Departamento de Electrónica y Computadores, Universidad de Cantabria, Santander, Spain (both authors)

  • Venue:
  • OTM'07: Proceedings of the 2007 OTM Confederated International Conference "On the Move to Meaningful Internet Systems": CoopIS, DOA, ODBASE, GADA, and IS - Volume Part II
  • Year:
  • 2007

Abstract

The emergence of Grid technology provides an unrivalled opportunity for large-scale high-performance computing in several scientific communities, such as high-energy physics, astrophysics, meteorology and computational medicine. One high-energy physics application that is well suited to execution in a Grid environment, owing to its heavy data-processing requirements, is an artificial neural net used in the search for the Higgs boson. The aim of this work is therefore to parallelize, and to evaluate the performance and scalability of, the kernel of a training algorithm for a multilayer perceptron neural net that analyses data from the Large Electron Positron Collider at CERN. A wide variety of iterative methods exist for training the net, i.e. for converging towards the optimum values of its weights; in this work the hybrid linear-BFGS method, based on gradient descent, is used.

A first parallel implementation of the training, based on a master-slave architecture, was developed. In this scheme the slave nodes process the patterns and produce the outputs from which the error value is calculated. The master node collects the partial results, sums them and, from this information, builds a linear equation system; solving it yields the new weights, which are distributed to the slaves for the next iteration. This first parallelization does not scale well: as the size of the neural net grows, the master becomes a bottleneck because it saturates while solving the large system of equations. A second parallelization was therefore developed, in which the slave nodes solve the system of equations in a distributed way, removing this bottleneck; this solution is evaluated in this work.

Both implementations use the MPI message-passing library, in its MPICH-G2 distribution, on a heterogeneous Grid environment. The performance evaluation checks whether the parallel algorithm is suitable and scalable when executed in such an environment, and the results obtained in different Grid environments are also compared with those obtained on a shared-memory supercomputer.
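
The first master-slave scheme described in the abstract maps naturally onto MPI collective operations. The following C sketch illustrates only that communication structure (reduce the partial sums to the master, solve there, broadcast the new weights); it assumes a single weight vector of length NW, a linear system whose partial terms can be combined by summation, and two hypothetical placeholder routines, compute_local_contribution and solve_linear_system, standing in for the pattern evaluation and the solver. It is not the paper's actual code.

/*
 * Minimal sketch of the master-slave training loop described in the
 * abstract.  compute_local_contribution() and solve_linear_system() are
 * hypothetical stand-ins, NOT routines from the paper.
 */
#include <mpi.h>
#include <stdlib.h>

#define NW 256        /* number of weights (illustrative size)           */
#define MAX_ITERS 50  /* fixed iteration count instead of a convergence  */
                      /* test, for brevity                               */

/* Placeholder: the real code would forward-propagate the local patterns
 * through the perceptron and accumulate the partial error and the partial
 * terms A (NW x NW) and b (NW) of the linear system.                     */
static void compute_local_contribution(const double *w, double *A,
                                        double *b, double *err)
{
    *err = 0.0;
    for (int i = 0; i < NW; ++i) {
        A[i * NW + i] = 1.0;      /* dummy diagonal system                */
        b[i] = 0.01 * w[i];
        *err += w[i] * w[i];
    }
}

/* Placeholder: the real code would solve A * dw = b with a dense solver
 * and update the weights; here A is diagonal by construction.            */
static void solve_linear_system(const double *A, const double *b, double *w)
{
    for (int i = 0; i < NW; ++i)
        w[i] -= b[i] / A[i * NW + i];
}

int main(int argc, char **argv)
{
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double *w  = malloc(NW * sizeof(double));       /* current weights    */
    double *A  = calloc(NW * NW, sizeof(double));   /* local system terms */
    double *b  = calloc(NW, sizeof(double));
    double *Ag = calloc(NW * NW, sizeof(double));   /* summed system      */
    double *bg = calloc(NW, sizeof(double));
    double err, err_total;

    for (int i = 0; i < NW; ++i)
        w[i] = 1.0;                                 /* arbitrary start    */

    for (int it = 0; it < MAX_ITERS; ++it) {
        /* Every node evaluates its own share of the training patterns.   */
        compute_local_contribution(w, A, b, &err);

        /* The master (rank 0) collects and sums the partial results.     */
        MPI_Reduce(A, Ag, NW * NW, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
        MPI_Reduce(b, bg, NW,      MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
        MPI_Reduce(&err, &err_total, 1, MPI_DOUBLE, MPI_SUM, 0,
                   MPI_COMM_WORLD);

        /* First scheme: only the master solves the system; this is the
         * step that becomes the bottleneck as the net grows.             */
        if (rank == 0)
            solve_linear_system(Ag, bg, w);

        /* The new weights are distributed to the slaves for the next
         * iteration.                                                     */
        MPI_Bcast(w, NW, MPI_DOUBLE, 0, MPI_COMM_WORLD);
    }

    free(w); free(A); free(b); free(Ag); free(bg);
    MPI_Finalize();
    return 0;
}

Compiled with mpicc and launched with mpirun, each rank processes its own block of patterns, with rank 0 doubling as master. In the second scheme described in the abstract, the rank-0 call to the solver would be replaced by a solve distributed across all nodes, so the broadcast of the new weights remains but the master no longer saturates.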