Original article: Learning over sets with Recurrent Neural Networks: An empirical categorization of aggregation functions

  • Authors:
  • W. Heidl; C. Eitzinger; M. Gyimesi; F. Breitenecker

  • Affiliations:
  • Profactor GmbH, Im Stadtgut A2, 4407 Steyr-Gleink, Austria; Profactor GmbH, Im Stadtgut A2, 4407 Steyr-Gleink, Austria; Vienna University of Technology, Wiedner Hauptstraße 8-10, 1040 Wien, Austria; Vienna University of Technology, Wiedner Hauptstraße 8-10, 1040 Wien, Austria

  • Venue:
  • Mathematics and Computers in Simulation
  • Year:
  • 2011

Abstract

Numerous applications benefit from parts-based representations, which result in sets of feature vectors. To apply standard machine learning methods, these sets of varying cardinality must be aggregated into a single fixed-length vector. We have evaluated three common Recurrent Neural Network (RNN) architectures, Elman, Williams & Zipser, and Long Short-Term Memory (LSTM) networks, on approximating eight aggregation functions of varying complexity. The goal is to establish baseline results showing whether existing RNNs can be applied to learn order-invariant aggregation functions. The results indicate that the aggregation functions can be categorized according to whether they entail (a) selection of a subset of elements and/or (b) non-linear operations on the elements. We have found that RNNs can approximate aggregation functions requiring either (a) or (b) alone, as well as those requiring only linear sub-functions, with very high accuracy. However, the combination of (a) and (b) cannot be learned adequately by these RNN architectures, regardless of network size and architecture.
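
To make the setup concrete, the following is a minimal sketch of the kind of experiment the abstract describes: training an RNN (here an LSTM) to aggregate a variable-cardinality set of feature vectors into a fixed-length vector, using its final hidden state, against a simple linear target function (the element-wise sum). It assumes PyTorch; all names, dimensions, and hyperparameters are illustrative and not taken from the paper.

```python
import torch
import torch.nn as nn

# Illustrative setup (not the paper's code): an LSTM reads a set of
# feature vectors as a sequence; its final hidden state serves as the
# fixed-length aggregate, which a linear readout maps back to feat_dim.
feat_dim, hidden_dim = 4, 32
lstm = nn.LSTM(input_size=feat_dim, hidden_size=hidden_dim, batch_first=True)
readout = nn.Linear(hidden_dim, feat_dim)
optimizer = torch.optim.Adam(
    list(lstm.parameters()) + list(readout.parameters()), lr=1e-3
)
loss_fn = nn.MSELoss()

for step in range(1000):
    n = torch.randint(2, 10, (1,)).item()   # varying set cardinality
    x = torch.randn(1, n, feat_dim)         # one set of n feature vectors
    # Target: element-wise sum, a linear and order-invariant aggregation,
    # so any presentation order of the elements yields the same target.
    target = x.sum(dim=1)

    _, (h_n, _) = lstm(x)                   # h_n: (1, 1, hidden_dim)
    pred = readout(h_n.squeeze(0))          # fixed-length prediction
    loss = loss_fn(pred, target)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

Harder targets in the paper's taxonomy would replace the sum with a function that selects a subset of elements (e.g. a maximum) and/or applies non-linear operations to them; the reported results suggest sketches like the above succeed for either property alone but not for their combination.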