Regressor selection with the analysis of variance method

Authors:
Ingela Lind; Lennart Ljung
Affiliations:
Department of Electrical Engineering, Division of Automatic Control, Linköpings Universitet, SE-581 83 Linköping, Sweden (both authors)
Venue:
Automatica (Journal of IFAC)
Year:
2005

Abstract

Identification of non-linear dynamical models of a black-box nature involves structure decisions (which regressors to use and which regressor function to select) as well as estimation of the parameters involved. The typical approach in system identification seems to be to mix all these steps, which means, for example, that the selection of regressors is based on the fits that are achieved for different choices. Alternatively, one could interpret this regressor selection as being based on hypothesis tests (F-tests) at a certain confidence level that depends on the data. In many cases it would be desirable to decide which regressors to use independently of the other steps. In this paper we investigate what the well-known method of analysis of variance (ANOVA) can offer for this problem. System identification applications violate many of the ideal conditions for which ANOVA was designed, and we study how the method performs under such non-ideal conditions. ANOVA is much faster than a typical parametric estimation method using, e.g., neural networks. In our tests it is also more reliable in picking the correct structure, even under non-ideal conditions. One reason for this may be that ANOVA requires the data set to be balanced, that is, all parts of the regressor space are weighted equally. Simply applying tests of fit to the recorded data may, for structure identification, give improper weight to areas with many, or few, samples.
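
The balancing and F-test ideas mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes only two candidate regressors, a fixed quantization grid, and the textbook balanced two-way fixed-effects ANOVA. The names `balance_cells`, `two_way_anova_f`, `edges1`, `edges2` and `n_per_cell` are hypothetical and chosen for illustration.

```python
import numpy as np


def balance_cells(x1, x2, y, edges1, edges2, n_per_cell, rng):
    """Quantize two candidate regressors into a grid of cells and keep the
    same number of output samples in every cell (a balanced design).

    Assumption: the grid edges are chosen by the user; the sketch simply
    fails if some cell holds fewer than n_per_cell samples.
    """
    i = np.digitize(x1, edges1)  # level index of regressor 1, 0..len(edges1)
    j = np.digitize(x2, edges2)  # level index of regressor 2, 0..len(edges2)
    a, b = len(edges1) + 1, len(edges2) + 1
    cells = np.empty((a, b, n_per_cell))
    for p in range(a):
        for q in range(b):
            idx = np.flatnonzero((i == p) & (j == q))
            if idx.size < n_per_cell:
                raise ValueError(f"cell ({p}, {q}) has only {idx.size} samples")
            # Random subsampling so that each cell gets equal weight.
            cells[p, q] = y[rng.choice(idx, size=n_per_cell, replace=False)]
    return cells


def two_way_anova_f(cells):
    """F statistics of a balanced two-way ANOVA with interaction.

    cells has shape (a, b, n): a levels of regressor 1, b levels of
    regressor 2, and n >= 2 replicates per cell.
    """
    a, b, n = cells.shape
    grand = cells.mean()
    mean_a = cells.mean(axis=(1, 2))   # mean per level of regressor 1
    mean_b = cells.mean(axis=(0, 2))   # mean per level of regressor 2
    mean_ab = cells.mean(axis=2)       # mean per cell

    ss_a = b * n * np.sum((mean_a - grand) ** 2)
    ss_b = a * n * np.sum((mean_b - grand) ** 2)
    ss_ab = n * np.sum(
        (mean_ab - mean_a[:, None] - mean_b[None, :] + grand) ** 2
    )
    ss_e = np.sum((cells - mean_ab[:, :, None]) ** 2)  # within-cell scatter
    ms_e = ss_e / (a * b * (n - 1))

    return {
        "x1": ss_a / (a - 1) / ms_e,
        "x2": ss_b / (b - 1) / ms_e,
        "x1:x2": ss_ab / ((a - 1) * (b - 1)) / ms_e,
    }
```

In use, each F value would be compared with a critical value of the F distribution at a chosen significance level. A candidate regressor whose main effect and interaction are both insignificant is one that could be left out of the model.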