Regression Clustering

Authors:
Bin Zhang
Affiliations:
-
Venue:
ICDM '03 Proceedings of the Third IEEE International Conference on Data Mining
Year:
2003

Abstract

Complex distributions in real-world data are often modeled by a mixture of simpler distributions. Clustering is one of the tools for revealing the structure of this mixture. The same is true of datasets with chosen response variables that people run regression on. Without separating the clusters with very different response properties, the residue error of the regression is large. Input variable selection could also be misguided to a higher complexity by the mixture. In Regression Clustering (RC), K (>1) regression functions are applied to the dataset simultaneously, which guide the clustering of the dataset into K subsets, each with a simpler distribution matching its guiding function. Each function is regressed on its own subset of data with a much smaller residue error. Both the regressions and the clustering optimize a common objective function. We present an RC algorithm based on the K-Harmonic Means clustering algorithm and compare it with other existing RC algorithms based on K-Means and EM.
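
The alternation between regression and clustering described in the abstract can be illustrated with a minimal sketch of the K-Means-style variant, which is the simplest of the three families mentioned. This is not the K-Harmonic Means algorithm the paper presents. It assumes linear regression functions with an intercept and a squared-error objective, and the names `rc_kmeans` and `rc_objective` are hypothetical, chosen only for this illustration.

```python
import numpy as np

def rc_objective(X, y, coefs):
    """Common objective (assumed form): each point contributes the squared
    residual of its best-fitting regression function."""
    A = np.column_stack([X, np.ones(X.shape[0])])   # add intercept column
    resid = (A @ coefs.T - y[:, None]) ** 2         # shape (n, k)
    return float(resid.min(axis=1).sum())

def rc_kmeans(X, y, k, n_iter=50, seed=0):
    """K-Means-style regression clustering with k linear functions (sketch)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    A = np.column_stack([X, np.ones(n)])
    labels = rng.integers(0, k, size=n)             # random initial partition
    coefs = np.zeros((k, A.shape[1]))
    for _ in range(n_iter):
        # Regression step: least-squares fit of each function on its own subset.
        for j in range(k):
            mask = labels == j
            if mask.sum() >= A.shape[1]:            # skip subsets too small to fit
                coefs[j] = np.linalg.lstsq(A[mask], y[mask], rcond=None)[0]
        # Clustering step: reassign each point to the function with the
        # smallest squared residual.
        resid = (A @ coefs.T - y[:, None]) ** 2
        new_labels = resid.argmin(axis=1)
        if np.array_equal(new_labels, labels):      # partition is stable
            break
        labels = new_labels
    return coefs, labels
```

Calling `rc_kmeans(X, y, k=2)` on data drawn from two different lines returns one coefficient row (slope terms followed by the intercept) per function and a label per point. Like K-Means, this variant can stop at a local optimum that depends on the initial partition.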