In this paper we present and evaluate Inhambu, a distributed object-oriented system that supports the execution of data mining applications on clusters of PCs and workstations. The system provides a resource management layer, built on top of Java/RMI, that supports the execution of the Weka data mining tool. We evaluate the performance of Inhambu through several experiments on homogeneous, heterogeneous, and non-dedicated clusters, and compare the results with those achieved by a similar system, Weka-Parallel. Inhambu outperforms its counterpart for coarse-grained applications, mainly on heterogeneous and non-dedicated clusters. Our system also provides additional advantages such as application checkpointing, support for dynamically adding hosts to the cluster, automatic restarting of failed tasks, and more effective usage of the cluster. Inhambu is therefore a promising tool for efficiently executing real-world data mining applications. The software is available at the project's web site, http://incubadora.fapesp.br/projects/inhambu/.