Factorial k-means analysis for two-way data

Authors:
Maurizio Vichi;Henk A. L. Kiers
Affiliations:
Dipartimento di Statistica, Probabilità e Statistiche Applicate, Università di Roma "La Sapienza", P.le A. Moro 5, I-00185, Italy;Heymans Institute (PA), Grote Kruisstraat 2/1, 9712 TS Groningen, Netherlands
Venue:
Computational Statistics & Data Analysis
Year:
2001

Citing 0
Cited 7

A three-way clusterwise multidimensional unfolding procedure for the spatial representation of context dependent preferences
Computational Statistics & Data Analysis

Clustering and disjoint principal component analysis
Computational Statistics & Data Analysis

A data-mining approach for investigating social and economic geographical dynamics of β-thalassemia's spread
IEEE Transactions on Information Technology in Biomedicine - Special section on computational intelligence in medical systems

Factorial and reduced K-means reconsidered
Computational Statistics & Data Analysis

An optimal cluster-based approach for Subgroup Analysis using Information Complexity Criterion
International Journal of Business Intelligence and Data Mining

Clustering of functional data in a low-dimensional subspace
Advances in Data Analysis and Classification

Cluster Differences Unfolding for Two-Way Two-Mode Preference Rating Data
Journal of Classification

Abstract

A discrete clustering model and a continuous factorial model are fitted simultaneously to two-way data, with the aim of identifying the best partition of the objects, described by the best orthogonal linear combinations of the variables (factors) according to the least-squares criterion. This methodology, named factorial k-means analysis for its features, has a very wide range of applications since it fulfills a double objective: data reduction and synthesis, simultaneously in the direction of objects and variables; and variable selection in cluster analysis, identifying the variables that contribute most to determining the classification of the objects. The least-squares fitting problem proposed here is mathematically formalized as a quadratic constrained minimization problem with mixed variables. An iterative alternating least-squares algorithm based on two main steps is proposed to solve this problem. First, starting from the cluster centroids, the subspace projection is found that leads to the smallest distances between object points and centroids. Second, after the centroids are updated, the partition is detected by assigning objects to the closest centroids. At each step the algorithm decreases (or at least does not increase) the least-squares criterion, thus converging to a solution that is at least locally optimal. Two data sets are analyzed to show the features of the factorial k-means model. The proposed algorithm is fast, which allows researchers to apply the technique to large data sets as well.
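
The two alternating steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a notation the abstract does not give: a column-centred data matrix X (objects by variables), a column-orthonormal loading matrix A, and a criterion equal to the sum of squared distances between the projected objects XA and their projected cluster centroids. Under that assumption, the best A for a fixed partition consists of the eigenvectors of the within-cluster scatter matrix with the smallest eigenvalues. The function name `factorial_kmeans`, its parameters, the random initial partition, and the empty-cluster repair are all hypothetical choices made for illustration.

```python
import numpy as np

def factorial_kmeans(X, n_clusters, n_factors, n_iter=100, seed=0):
    """Illustrative alternating least-squares sketch (hypothetical helper).

    Returns (labels, A, centroids in the reduced space, criterion history).
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    X = X - X.mean(axis=0)            # assumption: variables are column-centred
    n, n_vars = X.shape
    if n < n_clusters or n_factors > n_vars:
        raise ValueError("need n >= n_clusters and n_factors <= number of variables")

    # Random balanced start, so that no cluster is empty initially.
    labels = rng.permutation(n) % n_clusters
    history = []

    for _ in range(n_iter):
        # Centroids in the full variable space for the current partition.
        M = np.vstack([X[labels == k].mean(axis=0) for k in range(n_clusters)])

        # Step 1: subspace update. Minimising trace(A' W A) over orthonormal A
        # gives the eigenvectors of the within-cluster scatter W with the
        # smallest eigenvalues (eigh returns eigenvalues in ascending order).
        R = X - M[labels]
        W = R.T @ R
        evals, evecs = np.linalg.eigh(W)
        A = evecs[:, :n_factors]
        history.append(float(evals[:n_factors].sum()))

        # Step 2: partition update. Assign each object to the closest
        # centroid in the reduced space.
        Y, C = X @ A, M @ A
        d2 = ((Y[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
        new_labels = d2.argmin(axis=1)

        # Repair empty clusters by moving one object out of the largest one.
        for k in range(n_clusters):
            if not np.any(new_labels == k):
                biggest = np.bincount(new_labels, minlength=n_clusters).argmax()
                new_labels[np.flatnonzero(new_labels == biggest)[0]] = k

        if np.array_equal(new_labels, labels):
            break                     # partition is stable
        labels = new_labels

    centroids = np.vstack([(X[labels == k] @ A).mean(axis=0)
                           for k in range(n_clusters)])
    return labels, A, centroids, history
```

A call such as `labels, A, centroids, history = factorial_kmeans(X, n_clusters=3, n_factors=2)` returns the partition, the loadings, the reduced-space centroids, and the criterion value recorded at each subspace update. The recorded values should not increase from one iteration to the next, but the result depends on the starting partition, so in practice several random starts (different `seed` values) would be compared and the run with the lowest final criterion kept.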