Modified Gath-Geva clustering for fuzzy segmentation of multivariate time-series

  • Authors:
  • Janos Abonyi; Balazs Feil; Sandor Nemeth; Peter Arva

  • Affiliations:
  • Department of Process Engineering, University of Veszprem, P.O. Box 158, Veszprem H-8201, Hungary (all authors)

  • Venue:
  • Fuzzy Sets and Systems
  • Year:
  • 2005

Quantified Score

Hi-index 0.22

Abstract

Partitioning a time-series into internally homogeneous segments is an important data-mining problem. Changes in the variables of a multivariate time-series are usually vague and are not concentrated at any particular time point, so it is not practical to define crisp segment boundaries. Although fuzzy clustering algorithms are widely used to group overlapping and vague objects, they cannot be applied directly to time-series segmentation, because the clusters need to be contiguous in time. This paper proposes a clustering algorithm that simultaneously identifies local probabilistic principal component analysis (PPCA) models, used to measure the homogeneity of the segments, and fuzzy sets, used to represent the segments in time. The algorithm favors clusters that are contiguous in time and is able to detect changes in the hidden structure of multivariate time-series. A fuzzy decision-making algorithm based on a compatibility criterion of the clusters has been worked out to determine the required number of segments, while the required number of principal components is determined from the scree plots of the eigenvalues of the fuzzy covariance matrices. The application example shows that this new technique is a useful tool for the analysis of historical process data.
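
As a rough illustration of two ingredients named in the abstract (fuzzy sets that represent segments in time, and eigenvalues of fuzzy covariance matrices as used for a scree plot), the following Python sketch may help. It is not the authors' algorithm: the abstract does not state the shape of the membership functions, so the Gaussian form is an assumption, the segment centers and widths are fixed by hand rather than identified by clustering, and no PPCA model is fitted. All function names and the synthetic data are hypothetical.

```python
import numpy as np


def time_memberships(t, centers, widths):
    """Normalized memberships of each time point in each segment.

    ASSUMPTION: Gaussian-shaped fuzzy sets in time; the abstract only says
    that fuzzy sets represent the segments.
    """
    t = np.asarray(t, dtype=float)[:, None]              # shape (N, 1)
    centers = np.asarray(centers, dtype=float)[None, :]  # shape (1, c)
    widths = np.asarray(widths, dtype=float)[None, :]    # shape (1, c)
    g = np.exp(-0.5 * ((t - centers) / widths) ** 2)
    return g / g.sum(axis=1, keepdims=True)              # rows sum to 1


def fuzzy_covariance(X, weights):
    """Membership-weighted mean and covariance of the rows of X."""
    X = np.asarray(X, dtype=float)
    w = np.asarray(weights, dtype=float)
    mean = (w[:, None] * X).sum(axis=0) / w.sum()
    D = X - mean
    cov = (w[:, None] * D).T @ D / w.sum()
    return mean, cov


def scree_eigenvalues(cov):
    """Eigenvalues of a symmetric matrix in descending order."""
    return np.linalg.eigvalsh(cov)[::-1]


# Synthetic three-variable series whose correlation structure changes at t = 100.
rng = np.random.default_rng(0)
t = np.arange(200)
latent = rng.normal(size=200)
noise = 0.1 * rng.normal(size=(200, 3))
X = noise.copy()
X[:100] += np.outer(latent[:100], [1.0, 1.0, 0.0])   # first regime
X[100:] += np.outer(latent[100:], [0.0, 1.0, -1.0])  # second regime

# Hand-picked segment centers and widths (hypothetical values).
U = time_memberships(t, centers=[50, 150], widths=[30, 30])

# One fuzzy covariance matrix and one eigenvalue spectrum per segment.
eigs = [scree_eigenvalues(fuzzy_covariance(X, U[:, i])[1]) for i in range(U.shape[1])]
```

Plotting each entry of `eigs` against its index gives a scree plot per segment. In this synthetic setup each regime is driven by a single latent variable, so one would expect one eigenvalue to dominate in each segment. Because each membership function has a single peak in time, the resulting fuzzy segments overlap smoothly at the boundary instead of switching at a crisp time point.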