Musical pitch estimation finds the pitch of a musical note, i.e., the fundamental frequency (F0) of an audio signal, and serves as a pre-processing step in many applications such as sound separation and musical note transcription. In this work, a pitch estimation method based on a classification framework is designed using a supervised single-hidden-layer feed-forward neural network. To achieve good generalization, fast training, and a small network size, two main investigations were carried out. First, a suitable feature vector is identified by comparing the performance of several feature generation methods, with the extreme learning machine (ELM) framework used to train the network. Second, different input-weight fine-tuning methods are compared with the goal of reducing the network size. The method is evaluated on multiple-pitch, multi-instrument signals generated from datasets of real musical instrument recordings. Among the feature generation methods, the feature vector formed by combining a pitch histogram with a pitch-frequency-scaled spectrum performs best in the experiments. For fine tuning, the ELM framework is compared with Cuckoo search and sign-based propagation tuning. After the network size is further reduced to 40%, the network trained with sign-based propagation tuning outperforms the one trained with the ELM framework on the unseen dataset.
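To illustrate the training framework described above, the following is a minimal sketch of an extreme learning machine for a single-hidden-layer feed-forward network: input weights and biases are drawn at random, and only the output weights are solved analytically via the pseudoinverse. This is a generic ELM illustration, not the authors' implementation; the function names, the tanh activation, and the hidden-layer size are assumptions for the example.

```python
import numpy as np

def elm_train(X, Y, n_hidden=40, seed=0):
    """Train an ELM: random input weights, analytic output weights.

    X: (n_samples, n_features) feature matrix (e.g., pitch-histogram features).
    Y: (n_samples, n_classes) one-hot target matrix (e.g., pitch classes).
    """
    rng = np.random.default_rng(seed)
    # Input weights and biases are random and never updated in plain ELM.
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    H = np.tanh(X @ W + b)                 # hidden-layer output matrix
    beta = np.linalg.pinv(H) @ Y           # least-squares output weights
    return W, b, beta

def elm_predict(X, W, b, beta):
    """Forward pass; argmax over columns gives the predicted class."""
    return np.tanh(X @ W + b) @ beta
```

In the paper's setting, the comparison is between leaving the random input weights fixed (plain ELM, as above) and fine-tuning them afterwards, e.g., with Cuckoo search or sign-based propagation, which allows the hidden layer to be made smaller.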