A framework for the cooperation of learning algorithms. Advances in Neural Information Processing Systems 3 (NIPS 1990).
Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond.
Learning over sets using kernel principal angles. Journal of Machine Learning Research.
Distinctive Image Features from Scale-Invariant Keypoints. International Journal of Computer Vision.
The Pyramid Match Kernel: Efficient Learning with Sets of Features. Journal of Machine Learning Research.
A learning algorithm for continually running fully recurrent neural networks. Neural Computation.
Classifying relational data with neural networks. Proceedings of the 15th International Conference on Inductive Logic Programming (ILP 2005).
Recurrent networks for structured data: a unifying approach and its properties. Cognitive Systems Research.
A general framework for adaptive processing of data structures. IEEE Transactions on Neural Networks.
Numerous applications benefit from parts-based representations that yield sets of feature vectors. To apply standard machine learning methods, these sets of varying cardinality must be aggregated into a single fixed-length vector. We evaluated three common Recurrent Neural Network (RNN) architectures, namely Elman, Williams & Zipser, and Long Short-Term Memory networks, on approximating eight aggregation functions of varying complexity. The goal is to establish baseline results showing whether existing RNNs can learn order-invariant aggregation functions. The results indicate that the aggregation functions can be categorized according to whether they entail (a) selection of a subset of elements and/or (b) non-linear operations on the elements. We found that these RNNs can learn to approximate aggregation functions requiring either (a) or (b) alone, and can approximate those requiring only linear sub-functions with very high accuracy. However, the combination of (a) and (b) cannot be learned adequately by these RNN architectures, regardless of network size and architecture.
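To make the setup concrete, the following is a minimal sketch of this kind of experiment, not the authors' code: PyTorch is assumed, and all names (e.g. LSTMAggregator) and hyperparameters are hypothetical. An LSTM reads a set of feature vectors presented as a sequence, and its final hidden state is mapped to a single output that is trained to approximate a target aggregation function. The example targets illustrate the categorization above: sum is linear, max requires selection (a), the mean of squares requires a non-linear operation (b), and the max of squares requires both (a) and (b).

```python
# Sketch only: not the paper's implementation. Assumes PyTorch.
import torch
import torch.nn as nn

# Example target aggregation functions, categorized as in the abstract:
#   sum     -> linear, no selection              (learnable with high accuracy)
#   max     -> selection (a) only                (learnable)
#   mean_sq -> non-linear operation (b) only     (learnable)
#   max_sq  -> selection (a) AND non-linear (b)  (reported as not learnable)
TARGETS = {
    "sum":     lambda x: x.sum(dim=0),
    "max":     lambda x: x.max(dim=0).values,
    "mean_sq": lambda x: (x ** 2).mean(dim=0),
    "max_sq":  lambda x: (x ** 2).max(dim=0).values,
}

class LSTMAggregator(nn.Module):
    """Maps a variable-cardinality set of d-dimensional vectors to one scalar."""
    def __init__(self, d: int, hidden: int = 32):
        super().__init__()
        self.lstm = nn.LSTM(input_size=d, hidden_size=hidden, batch_first=True)
        self.out = nn.Linear(hidden, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, set_size, d). The set is fed as a sequence, so the
        # network must learn an order-invariant function despite ordering.
        _, (h_n, _) = self.lstm(x)
        return self.out(h_n[-1]).squeeze(-1)

if __name__ == "__main__":
    d = 1  # scalar set elements, a hypothetical choice for illustration
    model = LSTMAggregator(d)
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    target_fn = TARGETS["max"]
    for step in range(1000):
        set_size = torch.randint(3, 11, (1,)).item()  # varying cardinality
        x = torch.rand(8, set_size, d)                # batch of 8 random sets
        y = target_fn(x.transpose(0, 1)).squeeze(-1)  # aggregate over set dim
        loss = nn.functional.mse_loss(model(x), y)
        opt.zero_grad()
        loss.backward()
        opt.step()
```

Under this setup, checking order invariance amounts to permuting the elements of a test set and comparing the outputs; a second pass over a shuffled copy of x should yield (approximately) the same predictions if the aggregation has been learned.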