Environmental sound recognition by measuring significant changes in the spectral entropy

Authors:
Jessica Beltrán-Márquez;Edgar Chávez;Jesús Favela
Affiliations:
CICESE, Mexico;Universidad Michoacana, Mexico;CICESE, Mexico
Venue:
MCPR'12 Proceedings of the 4th Mexican conference on Pattern Recognition
Year:
2012

Abstract

Automatically identifying activities can give caregivers of persons with dementia the information they need to recognize when assistance is required. Environmental audio carries significant and representative information about the context, which makes microphones a natural choice for identifying activities automatically. However, in real situations, the audio captured by microphones comes from overlapping sound sources, which makes identifying it a challenge for audio analysis and retrieval. In this paper we propose a succinct representation of the signal, obtained by measuring its multiband spectral entropy frame by frame and then applying a cosine transform and binary codification; we call this the Cosine Multi-Band Spectral Entropy Signature (CMBSES). To test our proposal, we created a database of mixtures, each combining three sounds (a triple) drawn from a collection of nine environmental sounds, at four different signal-to-noise ratios (SNR). We codified both the original sounds and the triples, and then searched for every original sound in the collection of mixtures. To establish a ground truth we also tested the same database with 48 people of assorted ages. Our feature extraction outperforms the state-of-the-art Mel Frequency Cepstral Coefficients (MFCC), and it also surpasses the human participants in the experiment.
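
The pipeline named in the abstract (per-frame multiband spectral entropy, a cosine transform, then binary codification) can be illustrated with a minimal Python sketch. The abstract does not give the parameters, so everything concrete here is an assumption made for illustration: the frame length and hop, the Hann window, eight equal-width frequency bands, an unnormalized DCT-II across bands, and a binarization rule that emits 1 when a coefficient increases from one frame to the next. The function names and the Hamming-distance comparison are likewise hypothetical and are not taken from the paper.

```python
import numpy as np


def frame_signal(x, frame_len=1024, hop=512):
    """Split a 1-D signal into overlapping Hann-windowed frames (assumed sizes)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx] * np.hanning(frame_len)


def multiband_entropy(frames, n_bands=8):
    """Shannon entropy (bits) of the normalized power spectrum in each band."""
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    entropies = []
    # Equal-width bands are an assumption; a perceptual scale could be used instead.
    for band in np.array_split(power, n_bands, axis=1):
        p = band / (band.sum(axis=1, keepdims=True) + 1e-12)
        entropies.append(-(p * np.log2(p + 1e-12)).sum(axis=1))
    return np.stack(entropies, axis=1)  # shape: (n_frames, n_bands)


def dct2(m):
    """Unnormalized DCT-II applied to each row of m."""
    n = m.shape[1]
    k = np.arange(n)[:, None]
    j = np.arange(n)[None, :]
    basis = np.cos(np.pi * (j + 0.5) * k / n)  # basis[k, j]
    return m @ basis.T


def signature(x, frame_len=1024, hop=512, n_bands=8):
    """Binary signature: 1 where a cosine coefficient rises between frames.

    The sign-of-change rule is an assumed binarization, chosen for illustration.
    """
    entropy = multiband_entropy(frame_signal(x, frame_len, hop), n_bands)
    coefficients = dct2(entropy)
    return (np.diff(coefficients, axis=0) > 0).astype(np.uint8)


def hamming_distance(a, b):
    """Fraction of differing bits over the frames both signatures share."""
    n = min(len(a), len(b))
    return float(np.mean(a[:n] != b[:n]))
```

Under these assumptions, searching for an original sound inside a mixture would amount to sliding the sound's signature along the mixture's signature and keeping the offset with the smallest Hamming distance. Binary codes keep that comparison cheap, which is one plausible reason for the codification step.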