Automatic identification of activities can give caregivers of persons with dementia the information they need to recognize when assistance is required. Environmental audio carries significant and representative information about the context, which makes microphones a natural choice for identifying activities automatically. In real situations, however, the audio captured by a microphone comes from overlapping sound sources, which makes identifying those sources a challenge for audio analysis and retrieval. In this paper we propose a succinct representation of the signal: we measure its multi-band spectral entropy frame by frame and then apply a cosine transform and a binary codification. We call this representation the Cosine Multi-Band Spectral Entropy Signature (CMBSES). To test our proposal, we created a database of mixtures, each combining three sounds drawn from a collection of nine environmental sounds, at four different signal-to-noise ratios (SNR). We codified both the original sounds and the mixtures, and then searched for every original sound in the collection of mixtures. To establish a ground truth, we also ran the same test on the same database with 48 people of assorted ages. Our feature extraction outperforms the state-of-the-art Mel Frequency Cepstral Coefficients (MFCC), and it also surpasses the human listeners in the experiment.
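The pipeline named in the abstract (per-frame multi-band spectral entropy, then a cosine transform, then a binary codification) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the abstract does not give the frame length, the band layout, or the binarization rule, so the equal-width bands, the parameters `frame_len`, `hop`, and `n_bands`, the "did the coefficient increase since the previous frame" bit rule, and all function names are assumptions made here. The naive DFT keeps the sketch dependency-free; a practical version would use an FFT library.

```python
import cmath
import math


def power_spectrum(frame):
    """Power spectrum of a real frame via a naive DFT (bins 0 .. N/2 - 1)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def band_entropies(power, n_bands):
    """Shannon entropy (bits) of the normalized power inside each band.

    Assumption: equal-width bands; the abstract does not state the layout.
    """
    width = len(power) // n_bands
    entropies = []
    for b in range(n_bands):
        band = power[b * width:(b + 1) * width]
        total = sum(band)
        if total <= 0:
            entropies.append(0.0)
            continue
        probs = [p / total for p in band]
        entropies.append(-sum(q * math.log2(q) for q in probs if q > 0))
    return entropies


def dct2(values):
    """Unnormalized DCT-II, used here as the cosine transform."""
    n = len(values)
    return [sum(v * math.cos(math.pi * (i + 0.5) * k / n)
                for i, v in enumerate(values))
            for k in range(n)]


def signature(samples, frame_len=64, hop=32, n_bands=8):
    """Binary signature: one row of n_bands bits per pair of adjacent frames."""
    coeffs = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        coeffs.append(dct2(band_entropies(power_spectrum(frame), n_bands)))
    # Assumed binarization: bit is 1 when a coefficient rose since the last frame.
    return [[1 if cur > prev else 0 for prev, cur in zip(row_a, row_b)]
            for row_a, row_b in zip(coeffs, coeffs[1:])]


def hamming(sig_a, sig_b):
    """Number of differing bits between two equally sized signatures."""
    return sum(a != b
               for row_a, row_b in zip(sig_a, sig_b)
               for a, b in zip(row_a, row_b))


if __name__ == "__main__":
    # Synthetic test tone with a slowly changing component (illustrative only).
    samples = [math.sin(0.3 * t) + 0.5 * math.sin(1.7 * t * t / 256)
               for t in range(256)]
    sig = signature(samples)
    print(len(sig), "rows of", len(sig[0]), "bits")
    print("distance to itself:", hamming(sig, sig))
```

Because each signature is a compact bit matrix, searching for an original sound inside a mixture reduces to sliding the shorter signature over the longer one and keeping the offset with the smallest Hamming distance; that search strategy is likewise an assumption of this sketch rather than a detail stated in the abstract.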