Melody Transcription From Music Audio: Approaches and Evaluation

Authors:
G. E. Poliner; D. P. W. Ellis; A. F. Ehmann; E. Gomez; S. Streich; Beesuan Ong
Affiliations:
Dept. of Electr. Eng., Columbia Univ., New York, NY;-;-;-;-;-
Venue:
IEEE Transactions on Audio, Speech, and Language Processing
Year:
2007

Abstract

Although the process of analyzing an audio recording of a music performance is complex and difficult even for a human listener, there are limited forms of information that may be tractably extracted and yet still enable interesting applications. We discuss melody (roughly, the part a listener might whistle or hum) as one such reduced descriptor of music audio, and consider how to define it and what use it might be. We go on to describe the results of full-scale evaluations of melody transcription systems conducted in 2004 and 2005, including an overview of the systems submitted, details of how the evaluations were conducted, and a discussion of the results. For our definition of melody, current systems can achieve around 70% correct transcription at the frame level, including distinguishing between the presence or absence of the melody. Melodies transcribed at this level are readily recognizable, and show promise for practical applications.
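To make the frame-level figure concrete, the following is a minimal sketch of how such a score could be computed. It is not taken from the paper: the function name `overall_accuracy`, the convention that a frequency of 0 marks a frame without melody, and the 50-cent pitch tolerance are all assumptions chosen for illustration. A frame counts as correct if both sequences agree that no melody is present, or if both report a pitch and the two pitches lie within the tolerance.

```python
import math


def overall_accuracy(ref_hz, est_hz, tol_cents=50.0):
    """Fraction of frames on which the estimate agrees with the reference.

    Assumed conventions (illustrative, not from the paper):
    - one frequency value in Hz per analysis frame;
    - a value of 0 (or less) means "no melody in this frame";
    - a pitched frame is correct if it lies within tol_cents of the reference.
    """
    if len(ref_hz) != len(est_hz):
        raise ValueError("reference and estimate must have the same length")
    if not ref_hz:
        raise ValueError("at least one frame is required")

    correct = 0
    for ref, est in zip(ref_hz, est_hz):
        if ref <= 0 and est <= 0:
            # Both agree that the melody is absent.
            correct += 1
        elif ref > 0 and est > 0:
            # Both report a pitch: compare on a logarithmic (cents) scale.
            if abs(1200.0 * math.log2(est / ref)) <= tol_cents:
                correct += 1
        # Otherwise one side is voiced and the other is not: frame is wrong.
    return correct / len(ref_hz)


# Four frames: silence, a slightly sharp pitch, a missed note, an octave error.
reference = [0.0, 220.0, 220.0, 440.0]
estimate = [0.0, 222.0, 0.0, 880.0]
score = overall_accuracy(reference, estimate)
```

In this toy input the first two frames are counted as correct and the last two are not, so the score reflects pitch errors and melody presence/absence errors together, which is the sense in which the abstract's figure includes "distinguishing between the presence or absence of the melody".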