Abstract

Compositional data, which carry relative information, arise regularly in many disciplines and practical situations. Multivariate statistical methods, including regression analysis, have been adopted to model compositional data, but the existing research remains scattered and fragmented. This paper contributes a linear regression model in which both the dependent and the independent variables are compositional data. First, some operations on the simplex, such as perturbation, the power transformation, and the inner product, are defined for compositional-data vectors. After the isometric logratio (ilr) transformation is introduced, regression models are built on the original compositional data and on the transformed data, respectively. Theoretical inference shows that the two models are essentially equivalent under ordinary least squares (OLS) estimation. Two goodness-of-fit measures, the observed squared correlation coefficient R^2 and the cross-validated squared correlation coefficient Q^2, are also proposed to evaluate the regression models. In addition, the estimated regression parameters are interpreted in terms of relative elasticity. Finally, an empirical analysis illustrates the usefulness of multiple linear regression models for compositional-data variables.
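The simplex operations and the ilr-based regression described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the paper's implementation. The helper names (`closure`, `perturb`, `power`, `clr`, `ilr_basis`, `fit_ols_ilr`) are hypothetical. The Helmert-type orthonormal basis is one common choice for the ilr transformation. The R^2 shown is computed as one minus the ratio of residual to total sums of squares in ilr coordinates, which may differ from the definition the paper proposes. The data are simulated.

```python
import numpy as np

def closure(x):
    """Rescale positive parts so that each composition sums to 1."""
    x = np.asarray(x, dtype=float)
    return x / x.sum(axis=-1, keepdims=True)

def perturb(x, y):
    """Perturbation: component-wise product followed by closure."""
    return closure(np.asarray(x, dtype=float) * np.asarray(y, dtype=float))

def power(x, a):
    """Power transformation: component-wise power followed by closure."""
    return closure(np.asarray(x, dtype=float) ** a)

def clr(x):
    """Centered logratio coordinates (log parts minus their mean)."""
    logx = np.log(np.asarray(x, dtype=float))
    return logx - logx.mean(axis=-1, keepdims=True)

def aitchison_inner(x, y):
    """Inner product on the simplex, computed via clr coordinates."""
    return np.sum(clr(x) * clr(y), axis=-1)

def ilr_basis(D):
    """Assumed basis: D x (D-1) Helmert-type matrix with orthonormal,
    zero-sum columns."""
    V = np.zeros((D, D - 1))
    for i in range(1, D):
        V[:i, i - 1] = 1.0 / i
        V[i, i - 1] = -1.0
        V[:, i - 1] *= np.sqrt(i / (i + 1.0))
    return V

def ilr(x):
    """Isometric logratio coordinates of a D-part composition."""
    x = np.asarray(x, dtype=float)
    return clr(x) @ ilr_basis(x.shape[-1])

def ilr_inv(z):
    """Map ilr coordinates back to a composition on the simplex."""
    z = np.asarray(z, dtype=float)
    return closure(np.exp(z @ ilr_basis(z.shape[-1] + 1).T))

def fit_ols_ilr(X, Y):
    """OLS of ilr(Y) on ilr(X) with an intercept; returns (coef, R^2)."""
    Zx, Zy = ilr(X), ilr(Y)
    A = np.column_stack([np.ones(len(Zx)), Zx])
    coef, *_ = np.linalg.lstsq(A, Zy, rcond=None)
    resid = Zy - A @ coef
    total = Zy - Zy.mean(axis=0)
    r2 = 1.0 - (resid ** 2).sum() / (total ** 2).sum()
    return coef, r2

# Simulated example: 3-part compositions for both predictor and response.
rng = np.random.default_rng(0)
X = closure(rng.uniform(0.1, 1.0, size=(50, 3)))
B_true = np.array([[0.5, -0.2], [0.3, 0.8]])   # made-up slopes
b0_true = np.array([0.1, -0.4])                # made-up intercept
noise = 0.01 * rng.normal(size=(50, 2))
Y = ilr_inv(b0_true + ilr(X) @ B_true + noise)

coef, r2 = fit_ols_ilr(X, Y)
print("intercept:", coef[0])
print("slopes:\n", coef[1:])
print("R^2 in ilr coordinates:", r2)
```

Because the ilr transformation is an isometry, perturbation and powering on the simplex become ordinary addition and scalar multiplication in ilr coordinates. This is the property that lets a linear model on the simplex be fitted with standard OLS on the transformed data.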