Cocktail party processing

Authors:
DeLiang Wang; Guoning Hu
Affiliations:
Department of Computer Science and Engineering & Center for Cognitive Science, The Ohio State University, Columbus, OH; Biophysics Program, The Ohio State University, Columbus, OH
Venue:
WCCI'08 Proceedings of the 2008 IEEE world conference on Computational intelligence: research frontiers
Year:
2008

Abstract

Speech segregation, also known as the cocktail party problem, has proven extremely challenging. This chapter describes a monaural computational auditory scene analysis (CASA) approach to it. The approach performs auditory segmentation and grouping in a two-dimensional time-frequency representation that encodes proximity in frequency and time, periodicity, amplitude modulation, and onset/offset. In segmentation, our model decomposes the input mixture into contiguous time-frequency segments. Grouping is performed first for voiced speech: detected pitch contours are used to group voiced segments into a target stream and the background, with resolved and unresolved harmonics handled differently. Unvoiced segments are grouped by Bayesian classification of acoustic-phonetic features. This CASA approach has led to major advances towards solving the cocktail party problem.
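
The pitch-based grouping step mentioned in the abstract can be illustrated with a minimal sketch. It is not the chapter's model: it assumes the pitch period is already known, covers only the resolved-harmonic case, and uses synthetic sine tones in place of real filterbank outputs. It labels a time-frequency unit as target when its signal is strongly periodic at the pitch lag. The function names, the 0.85 threshold, and the signal parameters are all hypothetical choices made for illustration.

```python
import math


def normalized_autocorrelation(frame, lag):
    """Normalized autocorrelation of one frame at a given lag (in samples)."""
    n = len(frame) - lag
    if n <= 0:
        raise ValueError("lag must be shorter than the frame")
    head = frame[:n]
    tail = frame[lag:lag + n]
    num = sum(a * b for a, b in zip(head, tail))
    den = math.sqrt(sum(a * a for a in head) * sum(b * b for b in tail))
    return num / den if den > 0 else 0.0


def label_units(channel_frames, pitch_lag, threshold=0.85):
    """Return 1 for channels periodic at the pitch lag (target), else 0.

    The threshold is an assumed value, not one taken from the chapter.
    """
    return [
        1 if normalized_autocorrelation(frame, pitch_lag) > threshold else 0
        for frame in channel_frames
    ]


def make_tone(freq_hz, sample_rate, num_samples):
    """Synthetic stand-in for the output of one frequency channel."""
    return [
        math.sin(2 * math.pi * freq_hz * t / sample_rate)
        for t in range(num_samples)
    ]


SAMPLE_RATE = 8000                    # Hz (assumed)
FRAME_LEN = 320                       # 40 ms frame (assumed)
PITCH_HZ = 100                        # assumed target pitch
PITCH_LAG = SAMPLE_RATE // PITCH_HZ   # 80 samples

frames = [
    make_tone(200, SAMPLE_RATE, FRAME_LEN),  # 2nd harmonic of the pitch
    make_tone(300, SAMPLE_RATE, FRAME_LEN),  # 3rd harmonic of the pitch
    make_tone(250, SAMPLE_RATE, FRAME_LEN),  # not a harmonic of the pitch
]

mask = label_units(frames, PITCH_LAG)
print(mask)  # [1, 1, 0]
```

The two channels carrying harmonics of the assumed 100 Hz pitch repeat exactly at the pitch lag and are assigned to the target, while the 250 Hz channel is in antiphase at that lag and falls to the background. Unresolved harmonics, which the abstract says are handled differently, and unvoiced segments are outside the scope of this sketch.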