Matrix comparison, Part 1: Motivation and important issues for measuring the resemblance between proximity measures or ordination results

Authors:
Jesper W. Schneider; Pia Borlund
Affiliations:
Department of Information Studies, Royal School of Library and Information Science, Sohngaardsholmsvej 2, 9000 Aalborg, Denmark (both authors)
Venue:
Journal of the American Society for Information Science and Technology
Year:
2007

Cited by 8:

On the normalization and visualization of author co-citation data: Salton's Cosine versus the Jaccard index. Journal of the American Society for Information Science and Technology.
Appropriate similarity measures for author co-citation analysis. Journal of the American Society for Information Science and Technology.
Distance metrics for high dimensional nearest neighborhood recovery: Compression and normalization. Information Sciences: an International Journal.
Experimental comparison of first and second-order similarities in a scientometric context. Scientometrics.
Topological comparisons of proximity measures. PAKDD'12 Proceedings of the 16th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining - Volume Part I.
Incremental learning of patch-based bag of facial words representation for online face recognition in videos. PCM'12 Proceedings of the 13th Pacific-Rim conference on Advances in Multimedia Information Processing.
Combining social information for academic networking. Proceedings of the 2013 conference on Computer supported cooperative work.
Keep it simple and sparse: real-time action recognition. The Journal of Machine Learning Research.


Abstract

The present two-part article introduces matrix comparison as a formal means of evaluation in informetric studies such as cocitation analysis. This first part presents the motivation for introducing matrix comparison to informetric studies and discusses two important issues that influence such comparisons. The motivation is spurred by the recent debate on the choice of proximity measures and their potential influence on clustering and ordination results. The two issues discussed are matrix generation and the composition of proximity measures. The approach to matrix generation, i.e., how data is represented and transformed in a matrix, evidently determines the behavior of proximity measures; this is demonstrated on a single data set. Two different matrix generation approaches will, in all probability, lead to different proximity rankings of objects, which in turn lead to different ordination and clustering results for the same set of objects. Further, a resemblance in the composition of formulas indicates whether two proximity measures may produce similar ordination and clustering results. However, as shown in the case of the angular correlation and cosine measures, a small deviation in otherwise similar formulas can lead to different rankings, depending on the contour of the data matrix transformed. Ultimately, the behavior of proximity measures, that is, whether they produce similar rankings of objects, is more or less data-specific. Consequently, the authors recommend the use of empirical matrix comparison techniques in individual studies to investigate the degree of resemblance between proximity measures or their ordination results. In part two of the article, the authors introduce and demonstrate two related statistical matrix comparison techniques: the Mantel test and Procrustes analysis. These techniques can compare and evaluate the degree of monotonicity between different proximity measures or their ordination results. As such, the Mantel test and Procrustes analysis can be used as statistical validation tools in informetric studies and thus help in choosing suitable proximity measures. © 2007 Wiley Periodicals, Inc.
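As an illustration only (not the authors' code or data), the kind of comparison the abstract describes can be sketched in Python with NumPy. The sketch makes several assumptions: the toy matrix `X` is hypothetical; Pearson's r stands in for the correlation measure (it equals the cosine computed after centering each row on its mean, which is the "small deviation" between the two formulas); and the `mantel` helper is a simplified permutation version that correlates upper-triangle entries with Pearson's r and reports a one-sided p-value.

```python
import numpy as np


def cosine_matrix(X):
    """Cosine similarity between the rows of X."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    U = X / norms
    return U @ U.T


def pearson_matrix(X):
    """Pearson correlation between rows: cosine after centering each row."""
    return cosine_matrix(X - X.mean(axis=1, keepdims=True))


def upper_triangle(M):
    """Off-diagonal upper-triangle entries of a square matrix as a vector."""
    i, j = np.triu_indices_from(M, k=1)
    return M[i, j]


def mantel(A, B, permutations=999, seed=0):
    """Simplified Mantel test (assumption: Pearson r, one-sided p-value).

    Rows and columns of B are permuted together, so the object labels are
    shuffled while the internal structure of B is preserved.
    """
    rng = np.random.default_rng(seed)
    a = upper_triangle(A)
    observed = np.corrcoef(a, upper_triangle(B))[0, 1]
    n = A.shape[0]
    hits = 0
    for _ in range(permutations):
        p = rng.permutation(n)
        r = np.corrcoef(a, upper_triangle(B[np.ix_(p, p)]))[0, 1]
        if r >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical toy data: 5 objects (rows) described by 6 count features.
X = np.array([
    [5, 3, 0, 1, 0, 2],
    [4, 2, 1, 0, 0, 3],
    [0, 1, 6, 4, 2, 0],
    [1, 0, 5, 5, 3, 1],
    [2, 2, 2, 3, 7, 4],
], dtype=float)

C = cosine_matrix(X)
R = pearson_matrix(X)
r, p_value = mantel(C, R)
print(f"Mantel r = {r:.3f}, permutation p = {p_value:.3f}")
```

A high Mantel statistic on a given data set would suggest that the two measures rank object pairs similarly there; per the abstract, that conclusion is data-specific and need not carry over to a differently generated matrix.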