MARSYAS: a framework for audio analysis

Authors:
George Tzanetakis;Perry Cook
Affiliations:
Department of Computer Science, 35 Olden Street, and Department of Music, Princeton University, Princeton, NJ 08544, USA. E-mail: gtzan@cs.princeton.edu, prc@cs.princeton.edu Fax: +1 609-258-1771
Venue:
Organised Sound
Year:
1999

Abstract

Existing audio tools handle the increasing amount of computer audio data inadequately. The typical tape-recorder paradigm for audio interfaces is inflexible and time-consuming, especially for large data sets. On the other hand, completely automatic audio analysis and annotation is impossible using current techniques. Alternative solutions are semi-automatic user interfaces that let users interact with sound in flexible ways based on content. This approach offers significant advantages over manual browsing, annotation and retrieval. Furthermore, it can be implemented using existing techniques for audio content analysis in restricted domains. This paper describes MARSYAS, a framework for experimenting with, evaluating and integrating such techniques. As a test for the architecture, some recently proposed techniques have been implemented and tested. In addition, a new method for temporal segmentation based on audio texture is described. This method is combined with audio analysis techniques and used for hierarchical browsing, classification and annotation of audio files.