Sphinx-4: a flexible open source framework for speech recognition

Authors:
Willie Walker;Paul Lamere;Philip Kwok;Bhiksha Raj;Rita Singh;Evandro Gouvea;Peter Wolf;Joe Woelfel
Affiliations:
Sun Microsystems;Sun Microsystems;Sun Microsystems;Mitsubishi Electric Research Labs;Carnegie Mellon University;Carnegie Mellon University;Mitsubishi Electric Research Labs;Mitsubishi Electric Research Labs
Venue:
Sphinx-4: a flexible open source framework for speech recognition
Year:
2004

Abstract

Sphinx-4 is a flexible, modular and pluggable framework to help foster new innovations in the core research of hidden Markov model (HMM) speech recognition systems. The design of Sphinx-4 is based on patterns that have emerged from the design of past systems as well as new requirements based on areas that researchers currently want to explore. To exercise this framework, and to provide researchers with a "research-ready" system, Sphinx-4 also includes several implementations of both simple and state-of-the-art techniques. The framework and the implementations are all freely available via open source.