DECA: A Discrete-Valued Data Clustering Algorithm

Authors:
Andrew K. C. Wong; David C. C. Wang
Affiliations:
Member, IEEE, Department of Systems Design, University of Waterloo, Waterloo, Ont., Canada; School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213.
Venue:
IEEE Transactions on Pattern Analysis and Machine Intelligence
Year:
1979

Abstract

This paper presents a new clustering algorithm for analyzing unordered discrete-valued data. The algorithm consists of a cluster initiation phase and a sample regrouping phase. The first phase is based on a data-directed valley detection process that uses the optimal second-order product approximation of a high-order discrete probability distribution, together with a distance measure for discrete-valued data. The second phase iteratively applies the Bayes decision rule based on the subgroup discrete distributions. Because probability is the major decision criterion, the proposed method is less prone to yielding solutions that are sensitive to an arbitrarily chosen distance measure. The performance of the algorithm is evaluated on four sets of simulated data and one set of clinical data. For comparison, the decision-directed algorithm [11] is applied to the same data sets. These experiments demonstrate the validity and operational feasibility of the proposed algorithm and its superiority over the decision-directed algorithm.
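
The sample regrouping phase described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It makes these assumptions: initial labels are supplied by the caller (standing in for the cluster initiation phase, which is not reproduced here), each attribute takes integer values 0..k-1, and each subgroup distribution is estimated with a first-order product (attributes treated as independent) plus add-one smoothing, a simplification of the second-order product approximation named in the abstract. The names fit_subgroup, log_posterior, regroup, and alpha are hypothetical.

```python
import math
from collections import Counter

def fit_subgroup(samples, n_values, alpha=1.0):
    """Estimate smoothed per-attribute value probabilities for one subgroup.

    Assumption: attributes are treated as independent within a subgroup
    (first-order product), which is simpler than the paper's approximation.
    """
    total = len(samples)
    model = []
    for j, k in enumerate(n_values):
        counts = Counter(s[j] for s in samples)
        model.append({v: (counts[v] + alpha) / (total + alpha * k)
                      for v in range(k)})
    return model

def log_posterior(sample, prior, model):
    """Unnormalised log posterior: log P(group) + sum_j log P(x_j | group)."""
    return math.log(prior) + sum(
        math.log(model[j][v]) for j, v in enumerate(sample))

def regroup(data, labels, n_values, max_iter=50):
    """Reassign samples by the Bayes decision rule until labels stop changing."""
    labels = list(labels)
    for _ in range(max_iter):
        groups = sorted(set(labels))
        models, priors = {}, {}
        for g in groups:
            members = [x for x, lab in zip(data, labels) if lab == g]
            models[g] = fit_subgroup(members, n_values)
            priors[g] = len(members) / len(data)
        # Bayes decision rule: pick the subgroup with the largest posterior.
        new_labels = [
            max(groups, key=lambda g: log_posterior(x, priors[g], models[g]))
            for x in data
        ]
        if new_labels == labels:
            break  # converged: no sample changed subgroup
        labels = new_labels
    return labels

# Toy data (made up for illustration): four binary attributes per sample.
data = [(0, 0, 0, 0), (0, 0, 0, 1), (0, 0, 1, 0), (0, 0, 0, 0),
        (1, 1, 1, 1), (1, 1, 1, 0), (1, 1, 0, 1), (1, 1, 1, 1)]
initial = [0, 0, 0, 1, 1, 1, 1, 1]  # fourth sample starts in the wrong subgroup
final = regroup(data, initial, n_values=[2, 2, 2, 2])
print(final)
```

On this toy input the fourth sample is moved into the subgroup of the other samples whose first two attributes are 0, and a further pass changes nothing. Replacing fit_subgroup with a pairwise (dependence-tree) estimate would bring the sketch closer to the second-order approximation the abstract refers to.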