Speech activated telephony email reader (SATER) based on speaker verification and text-to-speech conversion

Authors:
Chung-Hsien Wu;Jau-Hung Chen
Affiliations:
Inst. of Inf. Eng., Nat. Cheng Kung Univ., Tainan;-
Venue:
IEEE Transactions on Consumer Electronics
Year:
1997

Citing 0
Cited 3

A speech synthesizer for Persian text using a neural network with a smooth ergodic HMM
ACM Transactions on Asian Language Information Processing (TALIP)

Long-Term Animal Observation by Wireless Sensor Networks with Sound Recognition
WASA '09 Proceedings of the 4th International Conference on Wireless Algorithms, Systems, and Applications

Implementation of Three Text to Speech Systems for Kurdish Language
CIARP '09 Proceedings of the 14th Iberoamerican Conference on Pattern Recognition: Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications

Quantified Score

Hi-index	0.43

Abstract

A speech activated telephony email reader (SATER) is proposed. SATER is an integrated system that combines speaker verification, networking, and text-to-speech (TTS) conversion. A registered user can activate the system and listen to their own email over a wired or wireless telephone. The speaker verification subsystem adopts a time-varying, speaker-dependent verification phrase: each speaker's password is used to generate the verification phrases for that speaker, and each verification phrase is modeled by a hidden Markov model (HMM) with a variable number of states. In the TTS subsystem, a prosody modification approach based on word units is proposed, in which appropriate word prosodic patterns for a sentence are selected from a word prosody database using linguistic features. The system was tested on 20 subjects. In the speaker verification test, the system yielded a false acceptance rate of 0.5% at a false rejection rate of 1.5%. For the TTS conversion system, the average correct rate for intelligibility was 95.7%, and the mean opinion score (MOS) for naturalness was 3.4.
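
The evaluation figures in the abstract (false rejection, false acceptance, and MOS) follow standard definitions, which can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: the function names, the accept-if-score-meets-threshold rule, the 1 to 5 rating scale, and all sample values below are assumptions made up for illustration only.

```python
from statistics import mean


def error_rates(genuine_scores, impostor_scores, threshold):
    """Return (false_rejection_rate, false_acceptance_rate) at a threshold.

    Assumption: a trial is accepted when its verification score is
    greater than or equal to the threshold.
    """
    # Genuine speakers whose score falls below the threshold are rejected.
    false_rejections = sum(1 for s in genuine_scores if s < threshold)
    # Impostors whose score reaches the threshold are accepted.
    false_acceptances = sum(1 for s in impostor_scores if s >= threshold)
    frr = false_rejections / len(genuine_scores)
    far = false_acceptances / len(impostor_scores)
    return frr, far


def mean_opinion_score(ratings):
    """Average listener ratings, assumed to lie on a 1 to 5 scale."""
    for r in ratings:
        if not 1 <= r <= 5:
            raise ValueError(f"rating out of range: {r}")
    return mean(ratings)


# Hypothetical scores and ratings, not data from the paper.
genuine = [0.9, 0.8, 0.4, 0.95]
impostor = [0.1, 0.3, 0.6, 0.2]
frr, far = error_rates(genuine, impostor, threshold=0.5)
mos = mean_opinion_score([3, 4, 3, 4])
```

Raising the threshold in such a scheme trades false acceptances for false rejections, which is why verification results are usually reported as a pair of rates at one operating point, as the abstract does.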