Multiresolution pitch analysis of talking, singing, and the continuum between

Authors:
David Gerhard
Affiliations:
Department of Computer Science, University of Regina, Regina, Saskatchewan, Canada
Venue:
RSFDGrC'05 Proceedings of the 10th international conference on Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing - Volume Part II
Year:
2005

Abstract

Talking and singing seem disparate, but there is a range of human utterances that fall between them, such as poetry, chanting, and rap music. This paper presents research into differentiation between talking and singing, development of feature-based analysis tools to explore the continuum between talking and singing, and evaluation of human perception of this continuum as compared to these analysis tools. Preliminary background is presented to acquaint the reader with some of the science used in the algorithm development. A corpus of sounds was collected to study the differences between singing and talking, and the procedures and results of this collection are presented. A set of features is developed to differentiate between talking and singing, and to investigate the intermediate vocalizations between talking and singing. The results of these features are examined and evaluated. The perception of speech is heavily influenced by pitch, which in the English language carries no lexicographic information but can carry higher-level semiotic information and can contribute to disambiguation.