An ontological framework for retrieving environmental sounds using semantics and acoustic content

Authors:
Gordon Wichern; Brandon Mechtley; Alex Fink; Harvey Thornburg; Andreas Spanias
Affiliations:
Arts, Media, and Engineering and Electrical Engineering Departments, Arizona State University, Tempe, AZ (all authors)
Venue:
EURASIP Journal on Audio, Speech, and Music Processing - Special issue on environmental sound synthesis, processing, and retrieval
Year:
2010

Abstract

Organizing a database of user-contributed environmental sound recordings allows sound files to be linked not only through the semantic tags and labels applied to them, but also through their acoustic similarity to other sounds. Of paramount importance in navigating these databases are the problems of retrieving similar sounds using text- or sound-based queries and of automatically annotating unlabeled sounds. We propose an integrated system that can be used for text-based retrieval of unlabeled audio, content-based query-by-example, and automatic annotation of unlabeled sound files. To this end, we introduce an ontological framework in which sounds are connected to each other based on the similarity between acoustic features specifically adapted to environmental sounds, while semantic tags and sounds are connected through link weights that are optimized based on user-provided tags. Furthermore, tags are linked to each other through a measure of semantic similarity, which allows for efficient incorporation of out-of-vocabulary tags, that is, tags that do not yet exist in the database. Results on two freely available databases of environmental sounds, contributed and labeled by nonexpert users, demonstrate effective recall, precision, and average precision scores for both the text-based retrieval and annotation tasks.
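The three link types described in the abstract (sound to sound, tag to sound, tag to tag) can be pictured as a single weighted graph. The following is a minimal sketch of that idea, not the authors' implementation: it assumes that lower link cost means stronger association and that candidates are ranked by shortest-path distance from the query node, which the abstract does not state. All node names, link costs, and function names (add_link, build_toy_graph, rank_from) are hypothetical and chosen only for illustration.

```python
import heapq


def add_link(graph, a, b, cost):
    """Add an undirected link with a non-negative cost (lower = more related)."""
    graph.setdefault(a, {})[b] = cost
    graph.setdefault(b, {})[a] = cost


def build_toy_graph():
    """Build a tiny graph with made-up costs; none of these values come from the paper."""
    graph = {}
    # Sound-to-sound links: stand-ins for acoustic feature similarity.
    add_link(graph, "sound:rain.wav", "sound:shower.wav", 0.5)
    add_link(graph, "sound:shower.wav", "sound:traffic.wav", 3.0)
    # Tag-to-sound links: stand-ins for weights learned from user-provided tags.
    add_link(graph, "tag:rain", "sound:rain.wav", 0.2)
    add_link(graph, "tag:car", "sound:traffic.wav", 0.3)
    # Tag-to-tag link: stand-in for semantic similarity. "drizzle" plays the role
    # of an out-of-vocabulary tag, attached to the graph only through another tag.
    add_link(graph, "tag:drizzle", "tag:rain", 0.4)
    return graph


def shortest_distances(graph, source):
    """Dijkstra's algorithm over a dict-of-dicts graph."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for neighbor, cost in graph.get(node, {}).items():
            nd = d + cost
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                heapq.heappush(heap, (nd, neighbor))
    return dist


def rank_from(graph, query, kind):
    """Rank nodes whose name starts with `kind` ("sound:" or "tag:") by distance from `query`."""
    dist = shortest_distances(graph, query)
    hits = [(n, d) for n, d in dist.items() if n.startswith(kind) and n != query]
    return [n for n, _ in sorted(hits, key=lambda item: item[1])]


graph = build_toy_graph()
# Text-based retrieval with an out-of-vocabulary tag.
print(rank_from(graph, "tag:drizzle", "sound:"))
# Query-by-example: start from a sound, rank other sounds.
print(rank_from(graph, "sound:shower.wav", "sound:"))
# Annotation: start from a sound, rank tags.
print(rank_from(graph, "sound:shower.wav", "tag:"))
```

In this sketch the same ranking routine serves all three tasks named in the abstract; only the query node and the type of node returned change. An out-of-vocabulary tag needs only tag-to-tag links to become usable as a query.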