The 14th international workshop on mobile computing systems and applications (ACM HotMobile 2013)
ACM SIGMOBILE Mobile Computing and Communications Review
Because of their limited processing capability, contemporary smartphones cannot extract frequency-domain acoustic features in real time on the device when the sampling rate is high. We propose a solution that exploits the sparseness in speech to extract frequency-domain acoustic features on a smartphone in real time, without any support from a remote server, even when the sampling rate is as high as 44.1 kHz. We perform an empirical study to quantify the sparseness in speech recorded on a smartphone and use it to efficiently obtain a highly accurate, sparse approximation of a widely used speech feature, the Mel-Frequency Cepstral Coefficients (MFCC). We name the new feature the sparse MFCC, or sMFCC for short. We experimentally determine the trade-offs between the approximation error and the expected speedup of sMFCC. We implement a simple spoken-word recognition application using both MFCC and sMFCC features, show that sMFCC is expected to be up to 5.84 times faster with accuracy within 1.1% to 3.9% of that of MFCC, and determine the conditions under which sMFCC runs in real time.
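To make the idea of a sparse approximation of MFCC concrete, the following is a minimal sketch and not the authors' implementation. It assumes a synthetic frame made of three pure tones (so the spectrum is exactly sparse, unlike real speech, which is only approximately sparse), and it obtains the sparse spectrum by computing a full naive DFT and zeroing all but the k largest bins. A sparse Fourier transform would avoid computing the full spectrum in the first place, so this sketch illustrates only the approximation side of the trade-off, not the speedup. All function names, the filter count, the coefficient count, and the frame parameters are illustrative assumptions.

```python
import cmath
import math


def power_spectrum(frame):
    """Naive O(n^2) DFT; returns the power of bins 0..n/2."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame))) ** 2
            for k in range(n // 2 + 1)]


def keep_top_k(power, k):
    """Zero every bin except the k bins with the largest power."""
    order = sorted(range(len(power)), key=lambda i: power[i], reverse=True)
    keep = set(order[:k])
    return [p if i in keep else 0.0 for i, p in enumerate(power)]


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_energies(power, sample_rate, n_filters):
    """Apply triangular filters whose edges are evenly spaced in mel."""
    nyquist = sample_rate / 2.0
    top = hz_to_mel(nyquist)
    # Filter edges expressed as fractional DFT bin positions.
    edges = [mel_to_hz(top * i / (n_filters + 1)) / nyquist * (len(power) - 1)
             for i in range(n_filters + 2)]
    energies = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = edges[m - 1], edges[m], edges[m + 1]
        e = 0.0
        for b, p in enumerate(power):
            if lo < b <= mid:
                e += p * (b - lo) / (mid - lo)   # rising slope
            elif mid < b < hi:
                e += p * (hi - b) / (hi - mid)   # falling slope
        energies.append(e)
    return energies


def mfcc(power, sample_rate, n_filters=20, n_coeffs=13):
    """Log mel energies followed by a DCT-II (unnormalised)."""
    # The 1e-10 floor is an assumption that keeps log() finite for empty filters.
    logs = [math.log(e + 1e-10) for e in mel_energies(power, sample_rate, n_filters)]
    return [sum(v * math.cos(math.pi * c * (j + 0.5) / n_filters)
                for j, v in enumerate(logs))
            for c in range(n_coeffs)]


def synthetic_frame(n=256, sample_rate=8000):
    """Three tones placed exactly on DFT bins 8, 32 and 80 (assumed test signal)."""
    tones = [(250.0, 1.0), (1000.0, 0.5), (2500.0, 0.25)]
    return [sum(a * math.sin(2 * math.pi * f * t / sample_rate) for f, a in tones)
            for t in range(n)]


def compare(k, n=256, sample_rate=8000):
    """Return (retained energy fraction, max |MFCC - sparse MFCC|) for a given k."""
    full = power_spectrum(synthetic_frame(n, sample_rate))
    sparse = keep_top_k(full, k)
    dense_c = mfcc(full, sample_rate)
    sparse_c = mfcc(sparse, sample_rate)
    max_diff = max(abs(a - b) for a, b in zip(dense_c, sparse_c))
    return sum(sparse) / sum(full), max_diff


if __name__ == "__main__":
    for k in (1, 2, 3):
        fraction, diff = compare(k)
        print(f"k={k}: retained energy {fraction:.4f}, max MFCC difference {diff:.3g}")
```

On this idealised frame, k = 3 already captures essentially all of the energy; with recorded speech, k would have to be chosen empirically, and the log floor for empty mel filters becomes a design choice that affects the approximation error.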