Determination of nonprototypical valence and arousal in popular music: features and performances

Authors:
Björn Schuller;Johannes Dorfner;Gerhard Rigoll
Affiliations:
Institute for Human-Machine Communication, Technische Universität München, München, Germany;Institute for Human-Machine Communication, Technische Universität München, München, Germany;Institute for Human-Machine Communication, Technische Universität München, München, Germany
Venue:
EURASIP Journal on Audio, Speech, and Music Processing - Special issue on scalable audio-content analysis
Year:
2010

Abstract

The mood of music is among the most relevant and commercially promising, yet challenging, attributes for retrieval in large music collections. This article first provides a short overview of methods and performance in the field. While most past research has relied on low-level audio descriptors for this purpose, this article reports results that exploit mid-level information such as the rhythmic and chordal structure or the lyrics of a musical piece. Special attention is given to realism and nonprototypicality of the selected songs in the database: all feature information is obtained by fully automatic preclassification, apart from the lyrics, which are automatically retrieved from online sources. Furthermore, instead of exclusively picking songs on whose perceived mood several annotators agree, a full collection of 69 double CDs, or 2,648 titles, is processed. Given the difficulty of this task, different modelling forms in the arousal and valence space are investigated, and the relevance of each feature group is reported.