Acoustic event detection (AED) aims to determine the identity of sounds and their temporal position in signals captured by one or several microphones. The AED problem has recently been proposed for meeting-room and classroom environments, where a specific set of meaningful sounds has been defined and several evaluations have been carried out within the international CLEAR evaluation campaigns. This paper reports the authors' work on AED in that framework and, in particular, presents its extension to the difficult problem of detecting overlapped sounds. Temporal overlaps accounted for more than 70% of the errors in the real-world interactive seminar recordings used in the CLEAR 2007 evaluations. The paper reports an attempt to deal with that problem at the level of models using our SVM-based AED system. The proposed two-step system noticeably outperforms the baseline system on both an artificially generated database and a real seminar recording database. The databases and metrics developed for the CLEAR 2007 evaluations are also described. Finally, a real-time AED system implemented in UPC's smart-room using several microphones is reported, along with a GUI-based demo that also includes the output of an acoustic source localization system.
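To make the notion of overlapped sounds concrete, the following minimal Python sketch represents each detected event as a label plus onset and offset times and lists the pairs whose time intervals intersect. It is an illustration only and not the paper's SVM-based system: the `AcousticEvent` class, the `overlapping_pairs` helper, and the event labels and times are all hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class AcousticEvent:
    """One detected sound (hypothetical structure, not from the paper)."""
    label: str    # identity of the sound
    start: float  # onset time in seconds
    end: float    # offset time in seconds


def overlapping_pairs(events):
    """Return every pair of events whose time intervals intersect."""
    ordered = sorted(events, key=lambda e: e.start)
    pairs = []
    for a, b in combinations(ordered, 2):
        # Two intervals overlap when each one starts before the other ends.
        if a.start < b.end and b.start < a.end:
            pairs.append((a, b))
    return pairs


# Made-up annotations for illustration only.
events = [
    AcousticEvent("speech", 0.0, 4.0),
    AcousticEvent("door_slam", 1.5, 2.0),
    AcousticEvent("applause", 5.0, 7.0),
]
overlaps = overlapping_pairs(events)
for a, b in overlaps:
    print(f"{a.label} overlaps {b.label}")
```

In this toy input, only the door slam falls inside the speech segment, which is the kind of temporal overlap the abstract identifies as the main source of errors.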