Sequential organization of speech in computational auditory scene analysis

Authors:
Yang Shao;DeLiang Wang
Affiliations:
Department of Computer Science and Engineering, The Ohio State University, 2015 Neil Ave., Columbus, OH 43210, USA;Department of Computer Science and Engineering, The Ohio State University, 2015 Neil Ave., Columbus, OH 43210, USA and Center for Cognitive Science, The Ohio State University, Columbus, OH 43210, ...
Venue:
Speech Communication
Year:
2009

Abstract

A human listener can follow a speaker's voice over time in the presence of other talkers and non-speech interference. This paper proposes a general system for sequential organization of speech based on speaker models. By training a general background model, the proposed system is shown to function well with both interfering talkers and non-speech intrusions. To handle situations where prior information about specific speakers is not available, a speaker quantization method is employed to extract representative models from a large speaker space, and the resulting generic models are used to perform sequential grouping. Our systematic evaluations show that grouping performance with generic models is only moderately lower than the performance achieved with known speaker models.
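
As a rough illustration of model-based sequential grouping, the sketch below assigns each speech segment to whichever model scores it highest. It is not the system described in the paper: the abstract does not specify the form of the speaker models, so a single diagonal Gaussian per source stands in for them, each segment is labeled independently, and the names `log_likelihood`, `group_segments`, `speaker_a`, `speaker_b`, and `background`, along with all numeric values, are hypothetical.

```python
import math


def log_likelihood(frames, mean, var):
    """Total log-likelihood of feature frames under a diagonal Gaussian.

    Assumption: one Gaussian per source, used here only as a stand-in
    for whatever speaker model a real system would train.
    """
    total = 0.0
    for frame in frames:
        for x, m, v in zip(frame, mean, var):
            total += -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
    return total


def group_segments(segments, models):
    """Label each segment with the name of its best-scoring model."""
    labels = []
    for frames in segments:
        best = max(models, key=lambda name: log_likelihood(frames, *models[name]))
        labels.append(best)
    return labels


# Hypothetical models: (mean vector, variance vector) per source.
models = {
    "speaker_a": ([0.0, 0.0], [1.0, 1.0]),
    "speaker_b": ([3.0, 3.0], [1.0, 1.0]),
    "background": ([-4.0, 4.0], [4.0, 4.0]),
}

# Hypothetical segments, each a list of 2-dimensional feature frames.
segments = [
    [[0.1, -0.2], [0.3, 0.1]],
    [[2.9, 3.2], [3.1, 2.7]],
    [[-4.5, 3.8]],
]

labels = group_segments(segments, models)
print(labels)
```

In this toy setup the `background` entry plays the role of a general background model that absorbs non-speech segments. Generic models obtained by speaker quantization could be placed in the same dictionary when the specific speakers are unknown.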