In this paper, we have constructed a large-scale ATM-connected PC cluster consisting of 100 PCs, implemented a data mining application on it, and optimized its execution environment. The default parameters of the TCP retransmission mechanism cannot provide good performance for the data mining application, because many collisions occur during all-to-all multicasting in a large-scale PC cluster. With TCP retransmission parameters set according to the proposed parameter optimization, a reasonably good performance improvement is achieved for parallel data mining on 100 PCs.

Association rule mining, one of the best-known problems in data mining, differs from conventional scientific calculations in its usage of main memory. We have investigated the feasibility of using available memory on remote nodes as a swap area when working nodes need to swap out their real memory contents. According to the experimental results on our PC cluster, the proposed method is expected to be considerably better than using hard disks as a swapping device.
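To illustrate why association rule mining stresses main memory, the following is a minimal single-node sketch of candidate counting for 2-itemsets in the Apriori style. It is not the parallel algorithm used in the paper; the function name `count_candidate_pairs`, the `min_count` threshold, and the toy transactions are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def count_candidate_pairs(transactions, min_count):
    """Count candidate 2-itemsets built from frequent single items.

    Hypothetical helper for illustration only; min_count is an absolute
    support threshold (number of transactions).
    """
    # Pass 1: count single items and keep those meeting the threshold.
    item_counts = Counter(item for t in transactions for item in set(t))
    frequent_items = {i for i, c in item_counts.items() if c >= min_count}

    # Pass 2: every pair of frequent items seen together gets a counter.
    # This table is what has to stay resident in memory during the scan.
    pair_counts = Counter()
    for t in transactions:
        kept = sorted(set(t) & frequent_items)
        pair_counts.update(combinations(kept, 2))

    frequent_pairs = {p: c for p, c in pair_counts.items() if c >= min_count}
    return frequent_items, pair_counts, frequent_pairs


# Toy data (made up for this sketch).
transactions = [
    ["bread", "milk"],
    ["bread", "diapers", "beer", "eggs"],
    ["milk", "diapers", "beer", "cola"],
    ["bread", "milk", "diapers", "beer"],
    ["bread", "milk", "diapers", "cola"],
]

frequent_items, pair_counts, frequent_pairs = count_candidate_pairs(
    transactions, min_count=3
)
print(len(frequent_items), len(pair_counts), len(frequent_pairs))
```

With n frequent single items, the counter table can hold up to n(n-1)/2 entries, so its size depends on the data rather than on a fixed array dimension. That data-dependent growth is the kind of memory pressure that can force a working node to swap.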