Diction based prosody modeling in table-to-speech synthesis

Authors:
Dimitris Spiliotopoulos;Gerasimos Xydas;Georgios Kouroupetroglou
Affiliations:
Department of Informatics and Telecommunications, University of Athens;Department of Informatics and Telecommunications, University of Athens;Department of Informatics and Telecommunications, University of Athens
Venue:
TSD'05 Proceedings of the 8th international conference on Text, Speech and Dialogue
Year:
2005

Abstract

Transferring a structure from the visual modality to the aural one is a difficult challenge. In this work we experiment with prosody modeling for the synthesized speech representation of tabulated structures. This is achieved by analyzing naturally spoken descriptions of data tables and the subsequent feedback from blind and sighted users. The derived placement and values of prosodic phrase accents and pause breaks are examined in terms of how successfully they convey semantically important visual information through prosody control in Table-to-Speech synthesis. Finally, the quality of information provision of synthesized tables that use the proposed prosody specification is compared against plain synthesis.
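
As a rough illustration of what pause break placement in a spoken table could look like, the following minimal sketch linearizes a small table row by row and inserts shorter pauses between cells and longer pauses between rows. It is not the authors' prosody specification: the use of SSML `<break>` markup, the function name `table_to_ssml`, and the pause durations are all assumptions made for illustration, and phrase accents are not modeled at all.

```python
from xml.sax.saxutils import escape

# Hypothetical pause lengths in milliseconds; illustrative only,
# not values reported in the paper.
CELL_BREAK_MS = 200
ROW_BREAK_MS = 600


def table_to_ssml(header, rows, cell_break_ms=CELL_BREAK_MS, row_break_ms=ROW_BREAK_MS):
    """Linearize a simple table row by row into an SSML string."""
    cell_break = f'<break time="{cell_break_ms}ms"/>'
    row_break = f'<break time="{row_break_ms}ms"/>'
    spoken_rows = []
    for row in rows:
        # Pair each data cell with its column header so the listener
        # keeps the column context that a sighted reader gets visually.
        cells = [f"{escape(str(h))}: {escape(str(v))}" for h, v in zip(header, row)]
        spoken_rows.append(cell_break.join(cells))
    # A longer pause marks the row boundary.
    return "<speak>" + row_break.join(spoken_rows) + "</speak>"


if __name__ == "__main__":
    print(table_to_ssml(["City", "Temp"], [["Athens", "30"], ["Patras", "28"]]))
```

The longer row-level pause stands in for the visual row boundary. A real system would tune such values against listener feedback, as the abstract describes.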