EROICA: exploring regions of interest with cluster analysis in large functional magnetic resonance imaging data sets

  • Authors:
  • M. Jarmasz; R. L. Somorjai

  • Affiliations:
  • Institute for Biodiagnostics, National Research Council of Canada, 435 Ellice Avenue, Winnipeg, Manitoba R3B 1Y6, Canada; Institute for Biodiagnostics, National Research Council of Canada, 435 Ellice Avenue, Winnipeg, Manitoba R3B 1Y6, Canada

  • Venue:
  • Concepts in Magnetic Resonance: an Educational Journal - Functional magnetic resonance imaging
  • Year:
  • 2003

Abstract

We describe a divide-and-conquer strategy for exploratory data analysis (EDA) of large functional magnetic resonance imaging (fMRI) data sets. The need for an EDA to precede and complement a confirmatory model-based analysis is now well established. For complex fMRI experiments, where a prior model of the expected response cannot be posited, the sole option is to conduct an initial EDA. An EDA often discovers unanticipated behavior, allowing the experimenter to augment or even change the original hypothesis. In addition, the gross artifact behavior that EDA makes evident may help the experimenter decide whether the data set is usable at all, whether some additional preprocessing step is required, or whether the preprocessing already applied has introduced spurious effects. The proposed strategy, named EROICA (exploring regions of interest with cluster analysis), evolved from the empirical observation that a typical cluster of activation or artifact time series can be partitioned into three subsets: time series corrupted by significant trends, time series above some noise level, and time series below it. Moreover, the sought-after common temporal behavior among the cluster time series can be extracted in uncorrupted form from the above-noise-level time series alone. Thus, the key feature of EROICA is the initial separation of the trendy and below-noise-level time series from the data set, followed by fuzzy cluster analysis (FCA) of the above-noise-level time series to extract common cluster behavior patterns (centroids). The initial partition is based on a test statistic in the power spectrum domain.
This step has significant ramifications: it greatly speeds up the FCA because far fewer time series need to be clustered; it makes the clustering results more robust because they are no longer affected by the trendy and noisy time series; the above-noise-level time series can be further grouped according to the location of the spectral peak on the frequency axis, and these groups can be used to create a subset of initial centroids that greatly improves the convergence rate of the FCA; and the group of below-noise-level time series (referred to as the noise pool) can be used as a data-driven representation of the underlying noise source. In the final step, each time series is modeled as a linear combination of the closest centroid plus noise. The noise pool is very convenient for obtaining thresholds when testing the significance of the model parameter, without having to model or assume the distributional properties of the underlying noise source. To limit the number of false positives in the activation maps, the significance test also examines the time series power spectrum values at the frequency locations determined by the cluster centroid. EROICA is one of the analysis options offered by the general image-processing package EvIdent®.
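
The pipeline outlined in the abstract can be illustrated with a minimal NumPy sketch on synthetic data. It is not the EvIdent implementation, and the abstract does not specify the test statistic, so several pieces are assumptions made only for illustration: the statistic (largest spectral peak divided by the median power), the cutoff `ratio=20`, the rule that a peak in the two lowest frequency bins marks a trend, the textbook fuzzy c-means update with fuzziness `m=2`, and the use of the 99th percentile of noise-pool regression coefficients as the significance threshold. All function names are hypothetical. The additional check of power spectrum values at the centroid frequencies is omitted.

```python
import numpy as np

def power_spectra(ts):
    """One-sided power spectrum of each row, mean removed, DC bin dropped."""
    ts = ts - ts.mean(axis=1, keepdims=True)
    return np.abs(np.fft.rfft(ts, axis=1))[:, 1:] ** 2

def partition(ts, ratio=20.0, n_trend_bins=2):
    """Return index arrays (trendy, above, noise) and each row's peak bin.

    Assumed statistic: largest spectral peak divided by the median power.
    """
    p = power_spectra(ts)
    peak_bin = p.argmax(axis=1) + 1              # +1 because DC was dropped
    significant = p.max(axis=1) / np.median(p, axis=1) > ratio
    trendy = significant & (peak_bin <= n_trend_bins)
    above = significant & ~trendy
    return (np.flatnonzero(trendy), np.flatnonzero(above),
            np.flatnonzero(~significant), peak_bin)

def fuzzy_c_means(x, centroids, m=2.0, n_iter=30):
    """Textbook fuzzy c-means updates starting from the given centroids."""
    for _ in range(n_iter):
        d = np.linalg.norm(x[:, None, :] - centroids[None, :, :], axis=2) + 1e-12
        u = d ** (-2.0 / (m - 1.0))
        u /= u.sum(axis=1, keepdims=True)        # memberships sum to 1 per row
        w = u ** m
        centroids = (w.T @ x) / w.sum(axis=0)[:, None]
    return centroids, u

def slopes(ts, c):
    """Least-squares coefficient of each mean-removed row on centroid c."""
    c = c - c.mean()
    ts = ts - ts.mean(axis=1, keepdims=True)
    return ts @ c / (c @ c)

# Synthetic data: two periodic "activations", a linear trend, and pure noise.
rng = np.random.default_rng(0)
n = 128
t = np.arange(n)
sig8 = 2.0 * np.sin(2 * np.pi * 8 * t / n)
sig16 = 2.0 * np.sin(2 * np.pi * 16 * t / n)
trend = np.linspace(-3.0, 3.0, n)
clean = np.vstack([np.tile(sig8, (20, 1)), np.tile(sig16, (20, 1)),
                   np.tile(trend, (10, 1)), np.zeros((150, n))])
data = clean + rng.standard_normal(clean.shape)

# Step 1: partition in the power spectrum domain.
trendy, above, noise, peak_bin = partition(data)

# Step 2: initial centroids from groups sharing a spectral peak, then FCA
# on the above-noise-level series only.
bins = np.unique(peak_bin[above])
init = np.array([data[above][peak_bin[above] == b].mean(axis=0) for b in bins])
centroids, u = fuzzy_c_means(data[above], init)
centroid_bins = power_spectra(centroids).argmax(axis=1) + 1

# Step 3: data-driven thresholds from the noise pool, one per centroid.
thresholds = [np.quantile(np.abs(slopes(data[noise], c)), 0.99)
              for c in centroids]
```

Only the above-noise-level rows are passed to the clustering step, which is where the speed and robustness gains described above come from. The thresholds are read directly off the noise pool, so no distributional form is assumed for the noise.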