Optimizing Protocol Parameters to Large Scale PC Cluster and Evaluation of its Effectiveness with Parallel Data Mining

  • Authors:
  • Masato Oguchi;Takahiko Shintani;Takayuki Tamura;Masaru Kitsuregawa

  • Venue:
  • HPDC '98 Proceedings of the 7th IEEE International Symposium on High Performance Distributed Computing
  • Year:
  • 1998

Abstract

PC clusters have recently been studied intensively as a platform for next-generation large-scale parallel computers. ATM technology is a strong candidate for a de facto standard in high-speed communication networks. An ATM-connected PC cluster is therefore a very promising platform for future high-performance computing from the cost/performance point of view.

In this paper, an ATM-connected PC cluster consisting of 100 PCs is reported, and the characteristics of a transport layer protocol for the PC cluster are evaluated. Point-to-point communication performance is measured and discussed as the TCP window size parameter is varied. Retransmission caused by cell loss at the ATM switch is analyzed, and retransmission parameters suitable for parallel processing on the large-scale PC cluster are identified.

From the application viewpoint, data-intensive applications such as data mining and ad hoc query processing in databases are considered very important for massively parallel processors, in addition to conventional scientific computation. Investigating the feasibility of such applications on an ATM-connected PC cluster is thus quite meaningful.

Parallel data mining is implemented and evaluated on the cluster. The default TCP configuration cannot provide good performance, because many collisions occur during the all-to-all multicasting executed on the large-scale PC cluster. With TCP parameters set according to the proposed optimization, a sufficient performance improvement is achieved for parallel data mining on 100 PCs.
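To make the TCP window size parameter concrete, here is a minimal Python sketch of how an application can request larger per-socket buffers, which bound the window TCP can advertise. It is an illustration only, not the authors' method: the abstract does not say how the parameters were set, and the helper names (bdp_bytes, open_tuned_socket), the 155 Mbit/s link rate, and the 1 ms round-trip time are all assumptions chosen for the example.

```python
import socket


def bdp_bytes(bandwidth_bps: int, rtt_us: int) -> int:
    """Bandwidth-delay product in bytes (integer arithmetic).

    This is the amount of data that must be in flight to keep a link
    busy, and so a common starting point for choosing a window size.
    """
    return bandwidth_bps * rtt_us // 8_000_000


def open_tuned_socket(window_bytes: int) -> socket.socket:
    """Create a TCP socket and request send/receive buffers of the given size.

    The operating system may round, double, or cap the requested value,
    so the effective size should be read back with getsockopt().
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, window_bytes)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, window_bytes)
    return sock


if __name__ == "__main__":
    # Assumed figures for illustration only: 155 Mbit/s link, 1 ms RTT.
    window = bdp_bytes(155_000_000, 1_000)
    s = open_tuned_socket(window)
    effective = s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    print(f"requested {window} bytes, kernel reports {effective} bytes")
    s.close()
```

Socket buffers cover only the window size. Retransmission behaviour, the other parameter the abstract discusses, is normally controlled inside the operating system's TCP implementation rather than per socket, so it is left out of this sketch.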