PC clusters have recently been studied intensively as the next generation of large-scale parallel computers, and ATM technology is a strong candidate for the de facto standard in high-speed communication networks. An ATM-connected PC cluster is therefore a promising platform, in cost/performance terms, for future high-performance computing. Data-intensive applications such as data mining and ad hoc query processing in databases are considered very important for massively parallel processors, alongside conventional scientific calculations, so investigating the feasibility of such applications on an ATM-connected PC cluster is worthwhile. This paper reports on an ATM-connected PC cluster consisting of 100 PCs and evaluates the characteristics of a transport-layer protocol for the cluster. Point-to-point communication performance is measured and discussed as the TCP window size parameter is varied. Parallel data mining is implemented and evaluated on the cluster. Retransmission caused by cell loss at the ATM switch is analyzed, and the retransmission parameters suitable for parallel processing on the large-scale PC cluster are identified. TCP with its default parameters cannot provide good performance, because many collisions occur during the all-to-all multicasting executed on the large-scale PC cluster. With TCP parameters set according to the proposed optimization, performance improves for parallel data mining on 100 PCs.
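The dependence of point-to-point performance on the TCP window size can be illustrated with the standard bound that at most one window of data is in flight per round trip. The following is a minimal sketch under that assumption, not the measurement code used in the paper; the function names and all numeric values are hypothetical, and the socket helper only requests buffer sizes, which the operating system may round or clamp.

```python
import math
import socket


def window_limited_throughput(window_bytes, rtt_seconds):
    """Upper bound on throughput (bytes/s) when one window is sent per round trip."""
    return window_bytes / rtt_seconds


def required_window(bandwidth_bytes_per_s, rtt_seconds):
    """Smallest window (bytes) that keeps a link of the given bandwidth full."""
    return math.ceil(bandwidth_bytes_per_s * rtt_seconds)


def open_socket_with_window(window_bytes):
    """Create a TCP socket and request send/receive buffers of window_bytes.

    Hypothetical helper: the kernel decides the effective size, so callers
    should read it back with getsockopt rather than assume the request held.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, window_bytes)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, window_bytes)
    return sock
```

For example, `window_limited_throughput(65536, 0.25)` evaluates to 262144.0 bytes per second, which shows why a small window can cap throughput well below the link rate when the round-trip time is large.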