Influence of data dimensionality on the quality of forecasts given by a multilayer perceptron

  • Authors:
  • Krzysztof Michalak; Halina Kwaśnicka

  • Affiliations:
  • Institute of Applied Informatics, Wrocław University of Technology, Wyb. Wyspiańskiego 27, 51-370 Wrocław, Poland; Institute of Applied Informatics, Wrocław University of Technology, Wyb. Wyspiańskiego 27, 51-370 Wrocław, Poland

  • Venue:
  • Theoretical Computer Science
  • Year:
  • 2007

Abstract

One of the phenomena that can be observed when using neural networks for time series prediction is that the quality of the forecasts obtained is correlated with the dimensionality of the data. Higher data dimensionality leads, in most cases, to higher prediction errors. This phenomenon is connected by some authors to the decrease in variance of the distances between the data points, which occurs when the lengths of the predicted vectors increase. In this paper, a proof is given that the variance of the distances between data points also decreases with the so-called correlation dimension of the data. Therefore, a drop in forecast quality might be expected not only when the lengths of the data vectors are increased, but also when using vectors of the same length to represent data of increasing dimensionality. We also present some experimental results that illustrate the dependence between data dimensionality and the variance of the distances between the data points, and the forecast error obtained when using a multilayer perceptron to predict future values of some time series.
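
The concentration of distances mentioned in the abstract can be illustrated with a small numerical sketch. The example below is not the authors' experiment: it assumes points drawn uniformly at random from the unit hypercube, and it measures the variance of pairwise Euclidean distances divided by the squared mean distance (a normalization chosen here for illustration; the paper's exact definition is not given in the abstract). It varies only the vector length, not the correlation dimension. The function name `relative_distance_variance` and its parameters are hypothetical.

```python
import math
import random
import statistics


def relative_distance_variance(dim, n_points=200, seed=0):
    """Variance of pairwise Euclidean distances over the squared mean distance.

    Assumption: points are sampled uniformly from the unit hypercube [0, 1]^dim.
    """
    rng = random.Random(seed)
    points = [[rng.random() for _ in range(dim)] for _ in range(n_points)]

    # Collect the distance for every unordered pair of distinct points.
    distances = [
        math.dist(points[i], points[j])
        for i in range(n_points)
        for j in range(i + 1, n_points)
    ]

    mean = statistics.fmean(distances)
    return statistics.pvariance(distances, mu=mean) / mean ** 2


if __name__ == "__main__":
    # Print the normalized variance for a few vector lengths.
    for dim in (2, 8, 32, 128):
        print(dim, relative_distance_variance(dim))
```

Under these assumptions the normalized variance shrinks as the vector length grows, so distances between points become harder to tell apart in relative terms. Whether the same holds for a given time series depends on its structure, which is the question the paper addresses through the correlation dimension.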