A Clustering Framework Based on Adaptive Space Mapping and Rescaling

  • Authors:
  • Yiling Zeng;Hongbo Xu;Jiafeng Guo;Yu Wang;Shuo Bai

  • Affiliations:
  • Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China 100080;Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China 100080;Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China 100080;Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China 100080;Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China 100080 and Shanghai Stock Exchange, Shanghai, China 200120

  • Venue:
  • AIRS '09 Proceedings of the 5th Asia Information Retrieval Symposium on Information Retrieval Technology
  • Year:
  • 2009

Abstract

Traditional clustering algorithms often suffer from the model misfit problem when the distribution of real data does not match the model's assumptions. To address this problem, we propose a novel clustering framework based on adaptive space mapping and rescaling, referred to as the M-R framework. The basic idea of our approach is to adjust the data representation so that the data distribution better fits the model assumptions. Specifically, documents are first mapped into a low-dimensional space with respect to the cluster centers so that the distribution statistics of each cluster can be analyzed on the corresponding dimension. With these statistics in hand, a rescaling operation is then applied to regularize the data distribution according to the model assumptions. These two steps are conducted iteratively along with the clustering algorithm to continually improve clustering performance. As an example, we apply the M-R framework to the most widely used clustering algorithm, k-means. Experiments on well-known datasets show that the M-R framework achieves performance comparable to state-of-the-art methods.
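
The abstract describes the map-then-rescale loop only in outline, so the following is a minimal sketch of one possible reading, not the authors' published method. It assumes that the mapping gives each document one coordinate per cluster (its Euclidean distance to that cluster's center), that the per-cluster statistic is the root-mean-square distance of a cluster's provisional members on its own dimension, and that rescaling divides each dimension by that statistic. The names `mr_kmeans_sketch`, `euclidean`, and `argmin` are hypothetical helpers introduced here for illustration.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def argmin(row):
    """Index of the smallest value in row (first index on ties)."""
    return min(range(len(row)), key=row.__getitem__)


def mr_kmeans_sketch(docs, init_centers, iters=10):
    """k-means with an assumed mapping and rescaling step in each iteration."""
    centers = [list(c) for c in init_centers]
    k = len(centers)
    labels = [0] * len(docs)
    for _ in range(iters):
        # Mapping (assumed form): coordinate j is the distance to center j,
        # so each document lives in a k-dimensional space.
        mapped = [[euclidean(d, c) for c in centers] for d in docs]
        provisional = [argmin(row) for row in mapped]

        # Statistics (assumed form): RMS distance of each cluster's
        # provisional members, measured on that cluster's own dimension.
        scales = []
        for j in range(k):
            own = [row[j] for row, lab in zip(mapped, provisional) if lab == j]
            rms = math.sqrt(sum(x * x for x in own) / len(own)) if own else 0.0
            scales.append(rms if rms > 0 else 1.0)  # fallback avoids dividing by 0

        # Rescaling (assumed form): normalize each dimension by its spread,
        # then assign every document to its nearest center in rescaled space.
        labels = [argmin([row[j] / scales[j] for j in range(k)]) for row in mapped]

        # Standard k-means update in the original space; an empty cluster
        # keeps its previous center.
        for j in range(k):
            members = [d for d, lab in zip(docs, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Toy data: two well-separated groups of three points each.
docs = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
labels, centers = mr_kmeans_sketch(docs, [docs[0], docs[3]], iters=5)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Initial centers are passed in explicitly to keep the sketch deterministic. With clusters of similar spread, as in the toy data, the rescaling leaves the assignments unchanged; it only affects the result when clusters differ in spread, which is the situation the abstract calls model misfit.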