Clustering with Apache Hadoop

Authors:
S. Nair;J. Mehta
Affiliations:
Shah and Anchor Kutchhi Engineering College, Chembur, Mumbai, India;Shah and Anchor Kutchhi Engineering College, Chembur, Mumbai, India
Venue:
Proceedings of the International Conference & Workshop on Emerging Trends in Technology
Year:
2011




Abstract

The self-organizing map (SOM) is an unsupervised neural network that projects high-dimensional data onto a low-dimensional grid and visually reveals the topological order of the original data. This makes the SOM an excellent tool in the exploratory phase of data mining. Self-organizing maps have been successfully applied in many fields, including engineering and business domains. Experimental results on a census database illustrate the clustering. The paper proposes to improve clustering performance by applying cloud computing, a recent approach. The approach focuses on Hadoop, a Java-based software framework that distributes processing over a cluster of processors and provides an open-source implementation of MapReduce, a powerful tool designed for the detailed analysis and transformation of very large data sets.
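
The abstract does not spell out how SOM training is split into map and reduce steps, so the following is only a minimal sketch of one common formulation (the batch SOM update), not the authors' implementation. It emulates the map, shuffle, and reduce phases in memory in plain Python and uses no Hadoop API. All function names, the Gaussian neighborhood, and the linear schedule for the neighborhood width are assumptions made for illustration.

```python
import math
from collections import defaultdict

def bmu_index(weights, x):
    """Index of the grid node whose weight vector is closest to sample x."""
    return min(range(len(weights)),
               key=lambda j: sum((w - v) ** 2 for w, v in zip(weights[j], x)))

def som_map(x, weights, grid, sigma):
    """Map step (hypothetical): for one sample, emit (node, (h * x, h)),
    where h is a Gaussian neighborhood weight around the sample's BMU."""
    b = bmu_index(weights, x)
    for j, (r, c) in enumerate(grid):
        d2 = (r - grid[b][0]) ** 2 + (c - grid[b][1]) ** 2
        h = math.exp(-d2 / (2.0 * sigma ** 2))
        yield j, ([h * v for v in x], h)

def som_reduce(values):
    """Reduce step (hypothetical): neighborhood-weighted mean for one node."""
    values = list(values)
    total_h = sum(h for _, h in values)
    dim = len(values[0][0])
    return [sum(vec[k] for vec, _ in values) / total_h for k in range(dim)]

def train_som(samples, grid, weights, epochs=10, sigma_start=1.0, sigma_end=0.3):
    """Run one map/shuffle/reduce round per epoch and return the new weights."""
    weights = [list(w) for w in weights]
    for t in range(epochs):
        # Assumed schedule: shrink the neighborhood width linearly.
        sigma = sigma_start + (sigma_end - sigma_start) * t / max(1, epochs - 1)
        shuffled = defaultdict(list)  # stands in for the shuffle phase
        for x in samples:
            for j, value in som_map(x, weights, grid, sigma):
                shuffled[j].append(value)
        weights = [som_reduce(shuffled[j]) for j in range(len(grid))]
    return weights

if __name__ == "__main__":
    grid = [(0, 0), (0, 1)]  # a 1 x 2 map
    samples = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
    trained = train_som(samples, grid, [[1.0, 1.0], [9.0, 9.0]])
    for x in samples:
        print(x, "-> node", bmu_index(trained, x))
```

Because each map call depends only on one sample and the current weights, the samples could in principle be partitioned across workers, with the per-node sums combined in the reduce step. In an actual Hadoop job the current weights would have to be distributed to every mapper for each epoch, which this in-memory sketch glosses over.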