Validating synthetic health datasets for longitudinal clustering

  • Authors: Shima Ghassem Pour; Anthony Maeder; Louisa Jorm
  • Affiliations: University of Western Sydney, Campbelltown, Australia (all three authors)
  • Venue: HIKM '13: Proceedings of the Sixth Australasian Workshop on Health Informatics and Knowledge Management, Volume 142
  • Year: 2013

Abstract

Clustering methods partition a dataset into subgroups that share homogeneous properties, where the number of subgroups and their characteristics are not known a priori. Cluster validation methods can help address the resulting problem of estimating the number of clusters and assessing the quality of each one. This paper presents such an approach, incorporating quantitative methods for comparing original and synthetic versions of longitudinal health datasets. The methods are demonstrated with two clustering algorithms, K-means and Latent Class Analysis, applied to synthetic data derived from the baseline data of the 45 and Up Study in NSW, Australia.
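
To make the idea of cluster validation concrete, the following is a minimal sketch and not the paper's actual procedure. It assumes the mean silhouette width as the validation index (the abstract does not name one), uses a plain K-means with a deterministic farthest-point initialisation, and substitutes randomly generated 2-D points for the 45 and Up Study data. The names `make_blobs`, `best_k`, `original` and `synthetic` are hypothetical, and Latent Class Analysis is not shown.

```python
import math
import random

def kmeans(points, k, iters=50):
    """Plain Lloyd's K-means; returns one cluster label per point."""
    # Assumption: farthest-point initialisation, chosen to keep the sketch deterministic.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by Euclidean distance.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centers.append(centers[c])  # leave an empty cluster's center in place
        if new_centers == centers:
            break
        centers = new_centers
    return labels

def silhouette(points, labels):
    """Mean silhouette width: values near 1 indicate compact, well-separated clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        return 0.0
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # singleton clusters contribute 0 by convention
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != lab)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

def make_blobs(seed, n=40):
    """Hypothetical toy data: three Gaussian groups standing in for real records."""
    rng = random.Random(seed)
    means = [(0.0, 0.0), (6.0, 6.0), (0.0, 6.0)]
    return [(rng.gauss(mx, 0.5), rng.gauss(my, 0.5)) for mx, my in means for _ in range(n)]

def best_k(points, ks=range(2, 6)):
    """Score each candidate number of clusters and return the best one with all scores."""
    scores = {k: silhouette(points, kmeans(points, k)) for k in ks}
    return max(scores, key=scores.get), scores

original = make_blobs(seed=1)   # stand-in for the original dataset
synthetic = make_blobs(seed=2)  # stand-in for its synthetic version

for name, data in (("original", original), ("synthetic", synthetic)):
    k, scores = best_k(data)
    print(name, "best k:", k, "silhouette:", round(scores[k], 3))
```

If the synthetic data preserves the cluster structure of the original, both datasets should select the same number of clusters with similar index values. A large gap in either would suggest that the synthetic version distorts the subgroup structure.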