In this paper, we present an algorithm for time-scale modification of music signals, based on the waveform similarity overlap-and-add (WSOLA) technique. A well-known disadvantage of standard WSOLA is that it time-scales the entire signal uniformly, including the perceptually significant transient sections (PSTs), where changes in the temporal envelope and significant spectral transitions occur. Time-scaling of PSTs can severely degrade the music quality. We address this problem by detecting the PSTs and leaving them intact, while time-scaling the remainder of the signal, which is relatively steady-state. In the proposed algorithm, the PSTs are detected using a Mel-frequency cepstrum nonstationarity measure and the normalized cross-correlation, with time-varying threshold functions. Our study shows that accurate detection of PSTs within the WSOLA framework makes it possible to achieve a higher quality of time-scaled music, as confirmed by subjective listening tests.
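
To make the role of the normalized cross-correlation concrete, the following is a minimal sketch of the similarity search at the core of a generic WSOLA step. It is not the authors' implementation: the function names, the `tolerance` parameter, and the plain-list signal representation are assumptions made for illustration, and the Mel-frequency cepstrum nonstationarity measure and the time-varying thresholds mentioned in the abstract are not reproduced here.

```python
import math


def normalized_cross_correlation(x, y):
    """Return the normalized cross-correlation of two equal-length frames.

    The result lies in [-1, 1]; 0.0 is returned if either frame is silent.
    """
    num = sum(a * b for a, b in zip(x, y))
    den = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return num / den if den > 0.0 else 0.0


def best_wsola_offset(signal, ref_start, cand_start, frame_len, tolerance):
    """Search cand_start +/- tolerance for the frame most similar to the
    reference frame that starts at ref_start (hypothetical helper).

    Returns (offset, score), where offset is relative to cand_start.
    """
    ref = signal[ref_start:ref_start + frame_len]
    best_offset, best_score = 0, -2.0  # -2.0 is below any valid score
    for delta in range(-tolerance, tolerance + 1):
        start = cand_start + delta
        # Skip candidates that would run past either end of the signal.
        if start < 0 or start + frame_len > len(signal):
            continue
        candidate = signal[start:start + frame_len]
        score = normalized_cross_correlation(ref, candidate)
        if score > best_score:
            best_offset, best_score = delta, score
    return best_offset, best_score


# Usage: a sinusoid with a 20-sample period, searched around sample 43.
tone = [math.sin(2.0 * math.pi * n / 20.0) for n in range(200)]
offset, score = best_wsola_offset(tone, ref_start=0, cand_start=43,
                                  frame_len=40, tolerance=5)
```

On a steady-state signal such as this sinusoid, the best score is close to 1 because a well-aligned frame exists within the search range. A low best score suggests that no good match is available, which is one plausible way a correlation value could help flag a candidate transient frame; how the paper combines it with the cepstral measure and its thresholds is not specified in the abstract.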