Heterogeneous data fusion for alzheimer's disease study

  • Authors:
  • Jieping Ye;Kewei Chen;Teresa Wu;Jing Li;Zheng Zhao;Rinkal Patel;Min Bae;Ravi Janardan;Huan Liu;Gene Alexander;Eric Reiman

  • Affiliations:
  • Arizona State University, Tempe, AZ, USA;Banner Good Samaritan Medical Center;Arizona State University, Tempe, AZ, USA;Arizona State University, Tempe, AZ, USA;Arizona State University, Tempe, AZ, USA;Arizona State University, Tempe, AZ, USA;Arizona State University, Tempe, AZ, USA;University of Minnesota;Arizona State University, Tempe, AZ, USA;Banner Good Samaritan Medical Center;Banner Good Samaritan Medical Center

  • Venue:
  • Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
  • Year:
  • 2008

Quantified Score

Hi-index 0.00

Visualization

Abstract

Effective diagnosis of Alzheimer's disease (AD) is of primary importance in biomedical research. Recent studies have demonstrated that neuroimaging parameters are sensitive and consistent measures of AD. In addition, genetic and demographic information have also been successfully used for detecting the onset and progression of AD. The research so far has mainly focused on studying one type of data source only. It is expected that the integration of heterogeneous data (neuroimages, demographic, and genetic measures) will improve the prediction accuracy and enhance knowledge discovery from the data, such as the detection of biomarkers. In this paper, we propose to integrate heterogeneous data for AD prediction based on a kernel method. We further extend the kernel framework for selecting features (biomarkers) from heterogeneous data sources. The proposed method is applied to a collection of MRI data from 59 normal healthy controls and 59 AD patients. The MRI data are pre-processed using tensor factorization. In this study, we treat the complementary voxel-based data and region of interest (ROI) data from MRI as two data sources, and attempt to integrate the complementary information by the proposed method. Experimental results show that the integration of multiple data sources leads to a considerable improvement in the prediction accuracy. Results also show that the proposed algorithm identifies biomarkers that play more significant roles than others in AD diagnosis.