Fusion and inference from multiple data sources in a commensurate space

Authors:
Zhiliang Ma;David J. Marchette;Carey E. Priebe
Affiliations:
Applied Mathematics & Statistics, Johns Hopkins University, Baltimore, MD, USA;Naval Surface Warfare Center, Dahlgren, VA, USA;Applied Mathematics & Statistics, Johns Hopkins University, Baltimore, MD, USA
Venue:
Statistical Analysis and Data Mining
Year:
2012

Abstract

Given objects measured under multiple conditions (for example, indoor versus outdoor lighting for face recognition, or translations into multiple languages for document matching), the challenge is to fuse the data and use all the available information for inference. We consider two exploitation tasks: (i) determining whether a set of feature vectors represents a single object measured under different conditions; and (ii) building a classifier from training data collected under one condition in order to classify objects measured under other conditions. The key to both problems is to transform data from multiple conditions into one commensurate space, where the transformed feature vectors are comparable and can be treated as if they had been collected under the same condition. Toward this end, we studied Procrustes analysis and developed a new approach that uses the interpoint dissimilarities for each condition. We impute the dissimilarities between measurements from different conditions to create one omnibus dissimilarity matrix, which is then embedded into Euclidean space. We illustrate our methodology on English and French documents collected from Wikipedia, demonstrating superior performance compared to that obtained via the standard Procrustes transformation. © 2012 Wiley Periodicals, Inc. Statistical Analysis and Data Mining 5: 187–193, 2012
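The two routes to a commensurate space named in the abstract can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not state: the cross-condition block of the omnibus matrix is imputed as the average of the two within-condition dissimilarity matrices (so matched objects get dissimilarity zero), the embedding is classical multidimensional scaling, and the Procrustes step is the plain orthogonal version. All function names (`pairwise_dist`, `classical_mds`, `omnibus_embed`, `procrustes_rotation`) are hypothetical and are not taken from the paper.

```python
import numpy as np


def pairwise_dist(X):
    """Euclidean interpoint dissimilarity matrix for the rows of X."""
    return np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)


def classical_mds(D, dim):
    """Embed a symmetric dissimilarity matrix D into R^dim by classical MDS."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                # double-centered squared dissimilarities
    evals, evecs = np.linalg.eigh(B)           # eigenvalues in ascending order
    top = np.argsort(evals)[::-1][:dim]        # indices of the dim largest eigenvalues
    lam = np.clip(evals[top], 0.0, None)       # guard against tiny negative values
    return evecs[:, top] * np.sqrt(lam)


def omnibus_embed(D1, D2, dim):
    """Jointly embed n matched objects observed under two conditions.

    ASSUMPTION: the unknown between-condition dissimilarities are imputed
    as the average of D1 and D2; the paper may use a different rule.
    """
    n = D1.shape[0]
    W = 0.5 * (D1 + D2)                        # imputed cross-condition block
    M = np.block([[D1, W], [W.T, D2]])         # 2n x 2n omnibus dissimilarity matrix
    Z = classical_mds(M, dim)
    return Z[:n], Z[n:]                        # condition-1 and condition-2 coordinates


def procrustes_rotation(A, B):
    """Orthogonal matrix R minimizing the Frobenius norm of A @ R - B."""
    U, _, Vt = np.linalg.svd(A.T @ B)
    return U @ Vt


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(20, 3))                   # objects under condition 1
    Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))   # unknown orthogonal distortion
    Y = X @ Q                                      # same objects under condition 2

    Z1, Z2 = omnibus_embed(pairwise_dist(X), pairwise_dist(Y), dim=3)
    print("max matched-pair distance:", np.linalg.norm(Z1 - Z2, axis=1).max())

    R = procrustes_rotation(X, Y)
    print("Procrustes residual:", np.linalg.norm(X @ R - Y))
```

In this toy setting the two conditions differ only by an orthogonal map, so both approaches align matched objects almost exactly. The case of interest in the abstract is the one where the conditions differ in more complicated ways, and this sketch makes no claim about relative performance there.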