We present a prototype that automatically aligns acoustic musical signals with their corresponding textual lyrics, producing output similar to manually aligned karaoke. We tackle this problem with a multimodal approach, in which pairing audio processing with text processing yields a more accurate system. Our audio processing combines top-down and bottom-up approaches, drawing on both low-level audio features and high-level musical knowledge to determine the hierarchical rhythm structure, the singing voice sections, and the chorus sections of the musical audio. Text processing is used to approximate the length of the sung passages from the textual lyrics. On a test bed of 20 songs (sampled from CD audio and carefully selected for variety), per-line alignment of the lyrics shows an average error of less than one bar. We perform holistic and per-component testing and analysis, and we outline steps for further development.
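The text-processing idea and the bar-based error measure can be illustrated with a small sketch. This is not the system described in the abstract, which does not specify its method. It assumes a crude vowel-run syllable counter, a proportional split of a known vocal-segment duration across lyric lines, and a fixed tempo in 4/4 time. All function names and parameters here are hypothetical.

```python
import re


def count_syllables(word: str) -> int:
    """Crude syllable estimate: count runs of vowels, at least one per word.

    Assumption for illustration only; real systems would use a
    pronunciation dictionary or phoneme-level durations.
    """
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def estimate_line_durations(lines, segment_seconds):
    """Split a vocal segment's duration across lyric lines,
    in proportion to each line's estimated syllable count."""
    counts = [
        sum(count_syllables(w) for w in re.findall(r"[A-Za-z']+", line))
        for line in lines
    ]
    total = sum(counts)
    if total == 0:
        return [0.0 for _ in lines]
    return [segment_seconds * c / total for c in counts]


def error_in_bars(predicted_start, true_start, tempo_bpm, beats_per_bar=4):
    """Express an alignment error (in seconds) as a number of bars,
    assuming a constant tempo and a fixed number of beats per bar."""
    bar_seconds = beats_per_bar * 60.0 / tempo_bpm
    return abs(predicted_start - true_start) / bar_seconds


if __name__ == "__main__":
    lyric_lines = ["la la la", "la"]
    print(estimate_line_durations(lyric_lines, 8.0))
    print(error_in_bars(10.0, 12.0, tempo_bpm=120))
```

Under these assumptions, a line's predicted start time is the running sum of the durations before it, and an "error of less than one bar" means `error_in_bars` returns a value below 1 for that line.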