Auditory spectrum-based pitched instrument onset detection

Authors:
Emmanouil Benetos;Yannis Stylianou
Affiliations:
School of Electronic Eng. and Comp. Science, Queen Mary Univ. of London, London, UK and Inst. of Computer Science, Foundation for Res. and Techn.-Hellas, Heraklion, Crete, Greece and the Dept. of ...;Institute of Computer Science, Foundation for Research and Technology-Hellas, Heraklion, Crete, Greece and the Computer Science Department, Multimedia Informatics Lab, University of Crete, Herakli ...
Venue:
IEEE Transactions on Audio, Speech, and Language Processing
Year:
2010

Abstract

This paper proposes a method for onset detection in music signals using auditory spectra. The auditory spectrogram is a time-frequency representation produced by a sound processing model that resembles the human auditory system. Recent work on onset detection employs DFT-based features describing spectral energy and phase differences, as well as pitch-based features, and these features are often combined to maximize detection performance. Here, the spectral flux and phase slope features are derived in the auditory framework, and a novel fundamental frequency estimation algorithm based on auditory spectra is introduced. An onset detection algorithm is proposed that processes these features and combines them at the decision level. Experiments are conducted on a dataset covering 11 pitched instrument types and containing 1829 onsets in total. Results indicate that auditory representations outperform various state-of-the-art approaches, with the onset detection algorithm reaching an F-measure of 82.6%.
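
To make two of the terms in the abstract concrete, the following is a minimal sketch of a spectral flux detection function and of the F-measure used to score detected onsets. It is not the authors' method: the flux here is computed on an ordinary DFT magnitude spectrogram rather than on an auditory spectrogram, and the frame length, hop size, 50 ms matching tolerance, and greedy one-to-one matching are all assumptions chosen for illustration. The function names `spectral_flux` and `f_measure` are hypothetical.

```python
import numpy as np

def spectral_flux(signal, frame_len=1024, hop=512):
    """Half-wave rectified spectral flux per frame (generic DFT version).

    Assumption: a plain Hann-windowed DFT stands in for the auditory
    spectrogram described in the abstract.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    mag = np.abs(np.fft.rfft(frames, axis=1))
    # Frame-to-frame magnitude change; keep only increases in energy.
    diff = np.diff(mag, axis=0)
    flux = np.sum(np.maximum(diff, 0.0), axis=1)
    # Prepend a zero so flux[k] lines up with frame k.
    return np.concatenate([[0.0], flux])

def f_measure(detected, reference, tol=0.05):
    """F-measure of detected onset times (seconds) against reference onsets.

    Assumption: a detection counts as correct if it lies within +/- tol
    seconds of a reference onset, and each detection is matched at most once.
    """
    detected = sorted(detected)
    used = [False] * len(detected)
    true_pos = 0
    for ref in sorted(reference):
        for j, det in enumerate(detected):
            if not used[j] and abs(det - ref) <= tol:
                used[j] = True
                true_pos += 1
                break
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(detected)
    recall = true_pos / len(reference)
    return 2 * precision * recall / (precision + recall)

# Usage: one second of audio at 16 kHz, silent until sample 8192,
# then a 440 Hz tone. The flux should peak around the tone's start.
sr = 16000
t = np.arange(sr) / sr
tone = np.where(np.arange(sr) >= 8192, np.sin(2 * np.pi * 440 * t), 0.0)
flux = spectral_flux(tone)
peak_frame = int(np.argmax(flux))
score = f_measure([0.10, 0.52, 1.00], [0.10, 0.50, 1.50])
```

In this toy case the flux is zero for every frame that contains only silence and rises sharply at the frames overlapping the start of the tone. A complete detector would still need peak picking and thresholding on `flux`, which this sketch omits.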