Efficient hold-out for subset of regressors

  • Authors:
  • Tapio Pahikkala; Hanna Suominen; Jorma Boberg; Tapio Salakoski

  • Affiliations:
  • Turku Centre for Computer Science, University of Turku, Department of Information Technology, Turku, Finland (all four authors)

  • Venue:
  • ICANNGA'09: Proceedings of the 9th International Conference on Adaptive and Natural Computing Algorithms
  • Year:
  • 2009

Abstract

Hold-out and cross-validation are among the most useful methods for model selection and performance assessment of machine learning algorithms. In this paper, we present a computationally efficient algorithm for calculating the hold-out performance of sparse regularized least-squares (RLS) when the method has already been trained on the whole training set. The computational complexity of performing the hold-out is O(|H|^3 + |H|^2 n), where |H| is the size of the hold-out set and n is the number of basis vectors. The algorithm can thus be used to calculate various types of cross-validation estimates efficiently. For example, when m is the number of training examples, the complexities of N-fold and leave-one-out cross-validation are O(m^3/N^2 + (m^2 n)/N) and O(mn), respectively. Further, since sparse RLS can be trained in O(m n^2) time for several regularization parameter values in parallel, the fast hold-out algorithm enables efficient selection of the optimal parameter value.
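
The idea of computing hold-out predictions without retraining can be illustrated with a minimal Python sketch. It is not the authors' algorithm: it assumes plain ridge regression on an m x n feature matrix X (standing in for the kernel evaluations against the n basis vectors) with an identity regularizer, and it uses the standard hold-out identity for regularized least-squares, in which the held-out residuals equal (I - G_HH)^-1 times the full-model residuals on H, where G is the hat matrix. The function names train_ridge, fast_holdout and retrain_holdout are hypothetical. With the m x n factor B stored at training time, each hold-out costs O(|H|^2 n) to form G_HH and O(|H|^3) to solve the small system.

```python
import numpy as np


def train_ridge(X, y, lam):
    """Train once on all m examples in O(m n^2) time.

    Returns B (m x n) such that the hat matrix is G = B @ B.T,
    together with the fitted values G @ y.
    """
    n = X.shape[1]
    A = X.T @ X + lam * np.eye(n)      # n x n regularized Gram matrix
    L = np.linalg.cholesky(A)          # A = L @ L.T
    B = np.linalg.solve(L, X.T).T      # B = X @ inv(L).T, so B @ B.T = X inv(A) X.T
    fitted = B @ (B.T @ y)
    return B, fitted


def fast_holdout(B, fitted, y, H):
    """Predictions for the examples in H by a model trained without them.

    Uses the identity  y_H - f_H^(-H) = (I - G_HH)^-1 (y_H - f_H).
    """
    B_H = B[H]
    G_HH = B_H @ B_H.T                                   # O(|H|^2 n)
    residual = np.linalg.solve(np.eye(len(H)) - G_HH,    # O(|H|^3)
                               y[H] - fitted[H])
    return y[H] - residual


def retrain_holdout(X, y, lam, H):
    """Reference implementation: actually retrain without the examples in H."""
    keep = np.setdiff1d(np.arange(X.shape[0]), H)
    A = X[keep].T @ X[keep] + lam * np.eye(X.shape[1])
    w = np.linalg.solve(A, X[keep].T @ y[keep])
    return X[H] @ w


# Synthetic data, used only to compare the shortcut with retraining.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 8))
y = rng.standard_normal(50)
H = np.array([3, 7, 11, 20, 42])

B, fitted = train_ridge(X, y, lam=1.0)
fast = fast_holdout(B, fitted, y, H)
slow = retrain_holdout(X, y, 1.0, H)
```

In this sketch, N-fold cross-validation would call fast_holdout once per fold on the same B and fitted values, and leave-one-out corresponds to hold-out sets of size one.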