Structural and semantic modeling of audio for content-based querying and browsing

Authors:
Mustafa Sert; Buyurman Baykal; Adnan Yazıcı
Affiliations:
Department of Computer Engineering, Başkent University, Ankara, Turkey; Department of Electrical and Electronics Engineering, Middle East Technical University, Ankara, Turkey; Department of Computer Engineering, Middle East Technical University, Ankara, Turkey
Venue:
FQAS'06 Proceedings of the 7th international conference on Flexible Query Answering Systems
Year:
2006

Abstract

A typical content-based audio management system deals with three aspects: audio segmentation and classification, audio analysis, and content-based retrieval of audio. In this paper, we integrate these three aspects into a single framework and propose an efficient method for flexible querying and browsing of auditory data. More specifically, we use two robust feature sets, MPEG-7 Audio Spectrum Flatness (ASF) and Mel Frequency Cepstral Coefficients (MFCC), as the underlying features to improve content-based retrieval accuracy, since each feature set has advantages for particular types of audio (e.g., music and speech). The proposed system offers a wide range of ways to query and browse audio data by content, such as querying and browsing for a chorus section or for sound effects, and query-by-example. In addition, clients can express their queries as point, range, and k-nearest neighbor queries, which are particularly significant in the multimedia domain.
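The three query forms named in the abstract can be illustrated with a minimal sketch. The abstract does not specify the distance measure, the index structure, or the feature dimensionality, so everything below is an assumption: Euclidean distance, a brute-force linear scan, and a hypothetical toy database (`CLIPS`) whose short vectors stand in for real ASF or MFCC features. The function names are illustrative and are not taken from the paper.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def euclidean(a: Vector, b: Vector) -> float:
    """Euclidean distance between two equal-length feature vectors (assumed metric)."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def point_query(db: Dict[str, Vector], query: Vector, tol: float = 1e-9) -> List[str]:
    """Point query: ids whose vector equals the query, up to a small tolerance."""
    return [cid for cid, vec in db.items() if euclidean(vec, query) <= tol]


def range_query(db: Dict[str, Vector], query: Vector, radius: float) -> List[Tuple[str, float]]:
    """Range query: (id, distance) pairs within `radius` of the query, nearest first."""
    hits = [(cid, euclidean(vec, query)) for cid, vec in db.items()]
    return sorted((h for h in hits if h[1] <= radius), key=lambda h: h[1])


def knn_query(db: Dict[str, Vector], query: Vector, k: int) -> List[Tuple[str, float]]:
    """k-nearest-neighbor query: the k closest (id, distance) pairs, nearest first."""
    hits = [(cid, euclidean(vec, query)) for cid, vec in db.items()]
    return sorted(hits, key=lambda h: h[1])[:k]


# Hypothetical toy database: clip id -> 3-dimensional feature vector.
# Real ASF or MFCC vectors would be longer and extracted from audio frames.
CLIPS: Dict[str, Vector] = {
    "speech_01": (0.0, 0.0, 0.0),
    "speech_02": (1.0, 0.0, 0.0),
    "music_01": (0.0, 3.0, 4.0),
    "effect_01": (6.0, 8.0, 0.0),
}
```

In a query-by-example setting, the query vector would be the feature vector extracted from the example clip. With the toy data above, `knn_query(CLIPS, (0.0, 0.0, 0.0), 2)` returns `speech_01` and `speech_02`, while `range_query(CLIPS, (0.0, 0.0, 0.0), 5.0)` additionally returns `music_01`, which lies exactly at distance 5. A linear scan is used here only for clarity; a practical system would likely rely on a multidimensional index.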