Hamming Distance based Clustering Algorithm

  • Authors:
  • Ritu Vijay; Prerna Mahajan; Rekha Kandwal

  • Affiliations:
  • Banasthali University, India; Prerna Mahajan, Research Scholar, Banasthali University, India; Ministry of Earth Sciences & Science and Technology, India

  • Venue:
  • International Journal of Information Retrieval Research
  • Year:
  • 2012

Abstract

Cluster analysis has been used extensively in machine learning and data mining to discover distribution patterns in data. Clustering algorithms generally rely on a distance metric to partition the data into small groups such that instances in the same group are more similar to each other than to instances in other groups. In this paper, the authors extend the concept of Hamming distance to categorical data. As a data processing step, they transform the data into a binary representation. They then use the proposed algorithm to group the data points into clusters. Experiments are carried out on data sets from the UCI Machine Learning Repository to analyze the algorithm's performance. The authors conclude that the proposed algorithm shows promising results and can be extended to handle numeric as well as mixed data.
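The abstract does not specify the binary encoding or the clustering procedure, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a one-hot encoding of each categorical attribute and a simple greedy "leader" grouping with a distance threshold; the names `one_hot_encode`, `hamming`, `threshold_cluster`, and `max_dist`, as well as the toy records, are hypothetical.

```python
def one_hot_encode(records):
    """Turn categorical records into bit tuples (assumed encoding: one bit per attribute value)."""
    # Collect the distinct values of each attribute column, in sorted order.
    vocab = [sorted(set(col)) for col in zip(*records)]
    encoded = []
    for rec in records:
        bits = []
        for value, values in zip(rec, vocab):
            bits.extend(1 if value == v else 0 for v in values)
        encoded.append(tuple(bits))
    return encoded


def hamming(a, b):
    """Number of positions at which two equal-length bit vectors differ."""
    if len(a) != len(b):
        raise ValueError("vectors must have equal length")
    return sum(x != y for x, y in zip(a, b))


def threshold_cluster(vectors, max_dist):
    """Greedy leader clustering (an assumption, not the paper's method):
    join the first cluster whose leader is within max_dist, else start a new one."""
    leaders, labels = [], []
    for v in vectors:
        for idx, leader in enumerate(leaders):
            if hamming(v, leader) <= max_dist:
                labels.append(idx)
                break
        else:
            leaders.append(v)
            labels.append(len(leaders) - 1)
    return labels


# Toy categorical data (illustrative only).
records = [
    ("red", "small", "round"),
    ("red", "small", "square"),
    ("blue", "large", "round"),
    ("blue", "large", "square"),
]
bits = one_hot_encode(records)
labels = threshold_cluster(bits, max_dist=2)
print(labels)
```

Under this one-hot assumption, each attribute on which two records disagree contributes two differing bits, so a threshold of 2 groups records that differ in at most one attribute.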