Interacting with multimedia information stored in systems or on the web exposes several difficulties inherent in the signal nature of such information. These difficulties are especially evident when palmtop devices are used. Developing and integrating a set of algorithms for extracting audio information is a primary step toward providing user-friendly access to multimedia information and toward developing powerful communication interfaces. Audio has several advantages over other communication media: hands-free operation, unattended interaction, and simple, cheap devices for capture and playback. A set of algorithms and processes for extracting semantic and syntactic information from audio signals, including voice, was defined. The extracted information was used both to access information in multimedia databases and to index it. More extensive, higher-level information, such as audio-source identification (speaker identification) and genre (in the case of music), must also be extracted from the audio signal. One basic task involves transforming audio into symbols (e.g. music into a score, speech into text) and, conversely, rendering symbols as audio (e.g. a score into musical audio, text into speech). The purpose is to make any kind of multimedia information searchable and accessible by means of audio. Attaining these results requires integrating digital audio processing, digital speech processing, and soft-computing methods.