Principal Component Analysis (PCA) has long been used as a tool for exploratory data analysis and for building predictive models. In recent years, computerized industries have produced ever larger volumes of data, creating demand for more efficient and effective PCA methods. In this paper, we work on interval-valued data and propose a new PCA method called CIPCA. CIPCA distinguishes itself from well-established methods such as VPCA and CPCA in that it can capture the complete information in interval-valued observations. Viewing each observation as a hypercube filled with infinitely dense, uniformly distributed points, CIPCA defines the inner product of interval-valued variables and reduces PCA modeling to the computation of inner products in the covariance matrix. Comparative experiments against VPCA and CPCA on synthetic data sets, together with applications to real-world data, demonstrate the merits of CIPCA in modeling interval-valued data. In particular, CIPCA provides an efficient and effective way to conduct PCA on large-scale numerical data and can reveal meaningful structure hidden in massive data.
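The hypercube view can be illustrated with a minimal Python sketch. It is not the paper's implementation: it assumes that points are uniform within each interval and independent across variables, so the second moment of an interval [a, b] is (a^2 + ab + b^2)/3 and cross terms reduce to products of midpoints. The function names `interval_covariance` and `interval_pca`, and the normalization by the number of observations, are illustrative choices and may differ from the normalization used in CIPCA.

```python
import numpy as np

def interval_covariance(lower, upper):
    """Covariance of interval-valued data under a uniform-hypercube assumption.

    lower, upper: (n, p) arrays of interval lower and upper bounds.
    Returns the mean vector and the (p, p) covariance matrix.
    """
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    n = lower.shape[0]

    # Mean of each variable: average of interval midpoints.
    mid = (lower + upper) / 2.0
    mean = mid.mean(axis=0)

    # Center the bounds and midpoints on the variable means.
    lo_c, up_c, mid_c = lower - mean, upper - mean, mid - mean

    # Off-diagonal entries: for independent uniform coordinates,
    # E[x * y] over a hypercube is the product of the midpoints.
    cov = mid_c.T @ mid_c / n

    # Diagonal entries: E[x^2] for x uniform on [a, b] is (a^2 + ab + b^2) / 3,
    # which also accounts for the spread inside each interval.
    diag = (lo_c**2 + lo_c * up_c + up_c**2).sum(axis=0) / (3.0 * n)
    np.fill_diagonal(cov, diag)
    return mean, cov

def interval_pca(lower, upper):
    """Eigen-decomposition of the interval covariance, largest variance first."""
    _, cov = interval_covariance(lower, upper)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], eigvecs[:, order]
```

When every interval collapses to a single point (`lower == upper`), this covariance coincides with the ordinary population covariance, so the sketch reduces to classical PCA. Only the interval bounds enter the computation, which is why summarizing a large numerical data set into intervals can make PCA cheaper.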