Learning from demonstration in robots: Experimental comparison of neural architectures

  • Authors:
  • Muhammad Umar Suleman; Mian M. Awais

  • Affiliations:
  • Department of Computer Science, School of Science and Engineering, Lahore University of Management Sciences (LUMS), Opposite Sector U, D.H.A, Lahore Cantt, 54792, Lahore, Pakistan (both authors)

  • Venue:
  • Robotics and Computer-Integrated Manufacturing
  • Year:
  • 2011

Abstract

Robots have played an important role in the automation of computer-aided manufacturing. Classical robot control involves an expensive key step: model-based programming. An intuitive way to reduce this expense is to replace programming with machine learning of robot actions from demonstration, where a learner robot learns an action by observing a demonstrator robot performing the same action. To achieve this learning from demonstration (LFD), different machine learning techniques such as Artificial Neural Networks (ANNs), Genetic Algorithms, Hidden Markov Models, and Support Vector Machines can be used. This work focuses exclusively on ANNs. Since ANNs have many standard architectural variations falling into two basic computational categories, recurrent networks and feed-forward networks, representative networks from each category have been selected for study: the Feed Forward Multilayer Perceptron (MLP) for the feed-forward category, and the Elman (EL) and Nonlinear Autoregressive Exogenous Model (NARX) networks for the recurrent category. The main objective of this work is to identify the neural architecture most suitable for applying LFD to the learning of different robot actions. The sensor and actuator streams of the demonstrated action are used as training data for ANN learning, and learning capability is measured by comparing the error between the demonstrator streams and the corresponding learner streams. To achieve fairness in comparison, three steps have been taken. First, Dynamic Time Warping (DTW) is used to measure the error between demonstrator and learner streams, which gives resilience against translation in time. Second, comparison statistics are drawn between the best, rather than weight-equal, configurations of the competing architectures, so that no architecture's learning capability is artificially handicapped. Third, each configuration's error is calculated as the average of ten trials over all possible learning sequences with random weight initialization, so that the error value is independent of any particular sequence of learning or set of initial weights. Six experiments were conducted to obtain a performance pattern for each architecture; in each experiment, nine different robot actions were tested. The resulting error statistics show that the NARX architecture is the most suitable for this learning problem, whereas the Elman architecture is the least suitable. Interestingly, the computationally simpler MLP yields much lower error than the computationally richer Elman architecture and only slightly higher error than NARX.
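
To make the comparison step concrete, below is a minimal sketch of Dynamic Time Warping applied to one-dimensional demonstrator and learner streams. This is not the paper's implementation: the function and variable names (`dtw_distance`, `demo`, `learner`) and the synthetic sine-wave streams are hypothetical, and the paper's actual streams are recorded sensor and actuator signals rather than toy data.

```python
import numpy as np

def dtw_distance(demonstrator, learner):
    """Dynamic Time Warping distance between two 1-D streams.

    The dynamic-programming table lets the two sequences be
    non-linearly aligned before pointwise differences are summed,
    which is what gives resilience against translation in time.
    """
    n, m = len(demonstrator), len(learner)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(demonstrator[i - 1] - learner[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],      # step in demonstrator only
                                 cost[i, j - 1],      # step in learner only
                                 cost[i - 1, j - 1])  # step in both (match)
    return cost[n, m]

# Hypothetical streams: the learner lags the demonstrator slightly,
# which a plain pointwise (Euclidean) comparison would over-penalise.
demo    = np.sin(np.linspace(0, 2 * np.pi, 100))
learner = np.sin(np.linspace(0, 2 * np.pi, 100) - 0.3)
print(dtw_distance(demo, learner))
```

Because the alignment absorbs a constant time shift, the DTW score reflects how well the learner reproduces the shape of the demonstrated action rather than penalising it for starting a few time steps late.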