ISM '06 Proceedings of the Eighth IEEE International Symposium on Multimedia
We propose a signal-based approach, as an alternative to the commonly used model-based approach, for automatically aligning vocal music with text lyrics at the word level. A text-to-speech system synthesizes a singing voice from the lyrics, so that aligning the music signal with its lyrics reduces to aligning two audio signals. The study builds on results from music information modeling and singing voice synthesis. In music information modeling, we study different music representation strategies for music segmentation, music region indexing, and region content description; in singing voice synthesis, we use music knowledge to generate a singing voice that approximates the target vocal line in terms of tempo. Experiments on a 20-song database show word-level alignment error rates of 26.3% and 36.1% at eighth-note and sixteenth-note alignment tolerances, respectively. The proposed approach offers an effective alternative for music-lyrics alignment that may require less training data.
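To make the idea of "alignment of two audio signals" and the note-based tolerances concrete, the following is a minimal sketch, not the method used in the paper. The abstract does not name the alignment algorithm; dynamic time warping (DTW) is swapped in here as a common choice for aligning two feature sequences. The function names `dtw_path` and `word_error_rate`, the one-dimensional features, and the assumption that one beat equals a quarter note are all hypothetical and chosen only for illustration.

```python
import math


def dtw_path(ref, query):
    """Align two 1-D feature sequences with classic dynamic time warping.

    Assumption: each frame is a single number (e.g. an energy value);
    real systems would use multi-dimensional spectral features.
    Returns (total_cost, path), where path pairs indices (i, j).
    """
    n, m = len(ref), len(query)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(ref[i - 1] - query[j - 1])  # local frame distance
            cost[i][j] = d + min(cost[i - 1][j],      # step in ref only
                                 cost[i][j - 1],      # step in query only
                                 cost[i - 1][j - 1])  # step in both
    # Backtrack from the end to recover the warping path.
    i, j = n, m
    path = []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    path.reverse()
    return cost[n][m], path


def word_error_rate(predicted_onsets, true_onsets, bpm, note_division=8):
    """Fraction of words whose onset error exceeds a note-length tolerance.

    Assumption: one beat is a quarter note, so an eighth note lasts half
    a beat and a sixteenth note a quarter of a beat. Onsets are in seconds.
    """
    tolerance = (60.0 / bpm) * (4.0 / note_division)
    misses = sum(abs(p - t) > tolerance
                 for p, t in zip(predicted_onsets, true_onsets))
    return misses / len(true_onsets)


if __name__ == "__main__":
    total, path = dtw_path([0, 1, 2], [0, 1, 1, 2])
    print(total, path)
    print(word_error_rate([0.0, 1.2, 2.4], [0.0, 1.0, 2.0], bpm=120))
```

At 120 beats per minute, an eighth-note tolerance under these assumptions is 0.25 s and a sixteenth-note tolerance is 0.125 s, which is why the tighter tolerance yields a higher error rate for the same predicted onsets.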