Glissando: a corpus for multidisciplinary prosodic studies in Spanish and Catalan

Authors:
Juan María Garrido;David Escudero;Lourdes Aguilar;Valentín Cardeñoso;Emma Rodero;Carme De-La-Mota;César González;Carlos Vivaracho;Sílvia Rustullet;Olatz Larrea;Yesika Laplaza;Francisco Vizcaíno;Eva Estebas;Mercedes Cabrera;Antonio Bonafonte
Affiliations:
Computational Linguistics Group (GLiCom), Department of Translation and Language Sciences, Universitat Pompeu Fabra, Barcelona, Spain;Department of Computer Sciences, Universidad de Valladolid, Valladolid, Spain;Department of Spanish Philology, Universitat Autònoma de Barcelona, Barcelona, Spain;Department of Computer Sciences, Universidad de Valladolid, Valladolid, Spain;Department of Communication, Universitat Pompeu Fabra, Barcelona, Spain;Department of Spanish Philology, Universitat Autònoma de Barcelona, Barcelona, Spain;Department of Computer Sciences, Universidad de Valladolid, Valladolid, Spain;Department of Computer Sciences, Universidad de Valladolid, Valladolid, Spain;Computational Linguistics Group (GLiCom), Department of Translation and Language Sciences, Universitat Pompeu Fabra, Barcelona, Spain;Department of Communication, Universitat Pompeu Fabra, Barcelona, Spain;Computational Linguistics Group (GLiCom), Department of Translation and Language Sciences, Universitat Pompeu Fabra, Barcelona, Spain;Department of Modern Languages, Universidad de las Palmas de Gran Canaria, Las Palmas de Gran Canaria, Spain;Department of Modern Languages, Universidad Nacional de Educación a Distancia, Madrid, Spain;Department of Modern Languages, Universidad de las Palmas de Gran Canaria, Las Palmas de Gran Canaria, Spain;Department of Signal Theory and Communications, Universitat Politècnica de Catalunya, Barcelona, Spain
Venue:
Language Resources and Evaluation
Year:
2013

Abstract

A review of the literature on prosody reveals a lack of corpora for prosodic studies in Catalan and Spanish. In this paper, we present a corpus intended to fill this gap. The corpus comprises two distinct datasets, a news subcorpus and a dialogue subcorpus, the latter containing either conversational or task-oriented speech. More than 25 h were recorded by twenty-eight speakers per language. Among these speakers, eight were professionals (four radio news broadcasters and four advertising actors). All of the material presented here has been transcribed, aligned with the acoustic signal, and prosodically annotated. Two major objectives have guided the design of this project: (i) to offer wide coverage of representative real-life communicative situations, allowing prosody in these two languages to be characterized; and (ii) to enable research studies that contrast the speakers' different speaking styles and discursive practices. All material contained in the corpus is provided under a Creative Commons Attribution 3.0 Unported License.