ROS open-source audio recognizer: ROAR environmental sound detection tools for robot programming

  • Authors: Joseph M. Romano; Jordan P. Brindza; Katherine J. Kuchenbecker

  • Affiliations: Robotics and Controls Group, Rethink Robotics Inc., Boston, USA; Department of Computer and Information Science, GRASP Laboratory, University of Pennsylvania, Philadelphia, USA; Department of Mechanical Engineering and Applied Mechanics, Haptics Group, GRASP Laboratory, University of Pennsylvania, Philadelphia, USA

  • Venue: Autonomous Robots
  • Year: 2013

Abstract

Advances in audio recognition have enabled the real-world success of a wide variety of interactive voice systems over the last two decades. More recently, the same techniques have shown promise in recognizing non-speech audio events. Sounds are ubiquitous in real-world manipulation: the click of a button, the crash of an object being knocked over, the whine of an electric power tool being activated. Surprisingly, very few autonomous robots use audio feedback to improve their performance. Modern audio recognition techniques can learn and recognize real-world sounds, but few implementations can be easily incorporated into modern robotic programming frameworks. This paper presents a new software library, the ROS Open-source Audio Recognizer (ROAR). ROAR provides a complete set of end-to-end tools for online supervised learning of new audio events, feature extraction, automatic one-class Support Vector Machine model tuning, and real-time audio event detection. Through implementation on a Barrett WAM arm, we show that combining the contextual information of the manipulation action with a set of learned audio events yields significant improvements in robotic task-completion rates.
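
The pipeline named in the abstract (feature extraction, a one-class model trained on examples of a single sound, then frame-by-frame detection) can be illustrated with a minimal, standard-library-only sketch. This is not ROAR's code or API. Every name below (`sine`, `frame_features`, `CentroidOneClass`) is hypothetical, the two features (log energy and zero-crossing rate) are chosen only for simplicity, and a centroid-plus-radius rule stands in for the one-class Support Vector Machine so that the example needs no external libraries.

```python
import math


def sine(freq_hz, n_samples, rate_hz=16000, amp=0.5):
    """Synthesize a mono sine tone (hypothetical helper used as test audio)."""
    return [amp * math.sin(2.0 * math.pi * freq_hz * n / rate_hz)
            for n in range(n_samples)]


def frame_features(samples, frame_len=256):
    """Cut the signal into non-overlapping frames and return one
    (log10 energy, zero-crossing rate) pair per complete frame."""
    feats = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        crossings = sum(1 for a, b in zip(frame, frame[1:])
                        if (a < 0) != (b < 0))
        feats.append((math.log10(energy + 1e-12),
                      crossings / (frame_len - 1)))
    return feats


class CentroidOneClass:
    """Stand-in one-class model (NOT an SVM): accept a feature vector
    if it lies within a learned radius of the training centroid."""

    def __init__(self, nu=0.1):
        # Assumed meaning: fraction of training frames allowed outside.
        self.nu = nu

    def fit(self, feats):
        n, dim = len(feats), len(feats[0])
        self.center = tuple(sum(f[i] for f in feats) / n for i in range(dim))
        dists = sorted(self._dist(f) for f in feats)
        # Radius is the distance that keeps about (1 - nu) of the training set.
        idx = min(n - 1, int(math.ceil((1.0 - self.nu) * n)) - 1)
        self.radius = dists[idx]
        return self

    def _dist(self, f):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(f, self.center)))

    def predict(self, f):
        """Return True when the frame resembles the trained sound."""
        return self._dist(f) <= self.radius


if __name__ == "__main__":
    # Train on one second of a 440 Hz tone, then score other audio per frame.
    model = CentroidOneClass(nu=0.1).fit(frame_features(sine(440.0, 16000)))
    for name, audio in [("440 Hz", sine(440.0, 1024)),
                        ("3000 Hz", sine(3000.0, 1024)),
                        ("silence", [0.0] * 1024)]:
        hits = [model.predict(f) for f in frame_features(audio)]
        print(name, hits)
```

In a robot setting, one such model would be trained per sound of interest, and a detection would plausibly be trusted only while the current manipulation action makes that sound expected. That gating step is an assumption about how context could be combined with detections, not a description of the paper's method.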