A framework for the cooperation of learning algorithms. Advances in Neural Information Processing Systems 3 (NIPS 1990).
Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond.
Learning over sets using kernel principal angles. Journal of Machine Learning Research.
Distinctive Image Features from Scale-Invariant Keypoints. International Journal of Computer Vision.
The Pyramid Match Kernel: Efficient Learning with Sets of Features. Journal of Machine Learning Research.
A learning algorithm for continually running fully recurrent neural networks. Neural Computation.
Classifying relational data with neural networks. Proceedings of the 15th International Conference on Inductive Logic Programming (ILP 2005).
Recurrent networks for structured data: a unifying approach and its properties. Cognitive Systems Research.
A general framework for adaptive processing of data structures. IEEE Transactions on Neural Networks.
Numerous applications benefit from parts-based representations that yield sets of feature vectors. To apply standard machine learning methods, these sets of varying cardinality must be aggregated into a single fixed-length vector. We evaluated three common Recurrent Neural Network (RNN) architectures, namely Elman, Williams & Zipser, and Long Short-Term Memory networks, on approximating eight aggregation functions of varying complexity. The goal is to establish baseline results showing whether existing RNNs can learn order-invariant aggregation functions. The results indicate that the aggregation functions can be categorized according to whether they entail (a) selection of a subset of elements and/or (b) non-linear operations on the elements. We found that these RNNs can learn to approximate aggregation functions requiring either (a) or (b) alone, and can approximate those requiring only linear sub-functions with very high accuracy. However, the combination of (a) and (b) cannot be learned adequately by these RNN architectures, regardless of network size and architecture.
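To make the setup concrete, the following is a minimal sketch of this kind of experiment, not the authors' code: PyTorch is assumed, and all names (e.g. LSTMAggregator) and hyperparameters are hypothetical. An LSTM reads a set of feature vectors presented as a sequence, and its final hidden state is mapped to a single output that is trained to approximate a target aggregation function. The example targets illustrate the categorization above: sum is linear, max requires selection (a), the mean of squares requires a non-linear operation (b), and the max of squares requires both (a) and (b).

```python
# Sketch only: not the paper's implementation. Assumes PyTorch.
import torch
import torch.nn as nn

# Example target aggregation functions, categorized as in the abstract:
#   sum     -> linear, no selection              (learnable with high accuracy)
#   max     -> selection (a) only                (learnable)
#   mean_sq -> non-linear operation (b) only     (learnable)
#   max_sq  -> selection (a) AND non-linear (b)  (reported as not learnable)
TARGETS = {
    "sum":     lambda x: x.sum(dim=0),
    "max":     lambda x: x.max(dim=0).values,
    "mean_sq": lambda x: (x ** 2).mean(dim=0),
    "max_sq":  lambda x: (x ** 2).max(dim=0).values,
}

class LSTMAggregator(nn.Module):
    """Maps a variable-cardinality set of d-dimensional vectors to one scalar."""
    def __init__(self, d: int, hidden: int = 32):
        super().__init__()
        self.lstm = nn.LSTM(input_size=d, hidden_size=hidden, batch_first=True)
        self.out = nn.Linear(hidden, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, set_size, d). The set is fed as a sequence, so the
        # network must learn an order-invariant function despite ordering.
        _, (h_n, _) = self.lstm(x)
        return self.out(h_n[-1]).squeeze(-1)

if __name__ == "__main__":
    d = 1  # scalar set elements, a hypothetical choice for illustration
    model = LSTMAggregator(d)
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    target_fn = TARGETS["max"]
    for step in range(1000):
        set_size = torch.randint(3, 11, (1,)).item()  # varying cardinality
        x = torch.rand(8, set_size, d)                # batch of 8 random sets
        y = target_fn(x.transpose(0, 1)).squeeze(-1)  # aggregate over set dim
        loss = nn.functional.mse_loss(model(x), y)
        opt.zero_grad()
        loss.backward()
        opt.step()
```

Under this setup, checking order invariance amounts to permuting the elements of a test set and comparing the outputs; a second pass over a shuffled copy of x should yield (approximately) the same predictions if the aggregation has been learned.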