Promoting free dialog video corpora: the IFADV corpus example

Authors:
R. J. J. H. Van Son;Wieneke Wesseling;Eric Sanders;Henk Van Den Heuvel
Affiliations:
ACLC, IFA, University of Amsterdam, The Netherlands;ACLC, IFA, University of Amsterdam, The Netherlands;SPEX, CLST, Radboud University Nijmegen, The Netherlands;SPEX, CLST, Radboud University Nijmegen, The Netherlands
Venue:
Multimodal corpora
Year:
2009

Abstract

Research into spoken language has become more visual over the years. Both fundamental and applied research have progressively included gestures, gaze, and facial expression. Corpora of multimodal conversational speech are rare and frequently difficult to use due to privacy and copyright restrictions. In contrast, Free-and-Libre corpora would allow anyone to add incremental annotations and improvements, distributing the cost of construction and maintenance. A freely available annotated corpus is presented with high-quality video recordings of face-to-face conversational speech. An effort has been made to remove copyright and use restrictions. Annotations have been processed into RDBMS tables that allow SQL queries and direct connections to statistical software. A few simple examples are presented to illustrate the use of a database of annotated speech. From our experience, we advocate the formulation of "best practices" for both the legal handling and the database storage of recordings and annotations.
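
As an illustration of what querying annotations stored in relational tables might look like, the following is a minimal sketch in Python using the standard-library `sqlite3` module. The table name `word_annotation`, its columns, the sample rows, and the function `mean_word_duration` are all assumptions made for this example; they are not the schema or data of the IFADV corpus.

```python
import sqlite3

# Hypothetical schema: one row per time-aligned word annotation.
# Table and column names are illustrative only, not the IFADV schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE word_annotation (
        recording TEXT,
        speaker   TEXT,
        start_s   REAL,
        end_s     REAL,
        word      TEXT
    )
    """
)

# Made-up sample rows, only to make the query runnable.
rows = [
    ("dialog01", "A", 0.00, 0.35, "ja"),
    ("dialog01", "B", 0.40, 0.90, "precies"),
    ("dialog01", "A", 1.10, 1.30, "ja"),
    ("dialog02", "A", 0.20, 0.80, "nou"),
]
conn.executemany("INSERT INTO word_annotation VALUES (?, ?, ?, ?, ?)", rows)


def mean_word_duration(connection):
    """Return (recording, speaker, word count, mean word duration in seconds)."""
    query = """
        SELECT recording, speaker, COUNT(*), AVG(end_s - start_s)
        FROM word_annotation
        GROUP BY recording, speaker
        ORDER BY recording, speaker
    """
    return connection.execute(query).fetchall()


for recording, speaker, n_words, mean_dur in mean_word_duration(conn):
    print(f"{recording} {speaker}: {n_words} words, mean {mean_dur:.3f} s")
```

Once annotations are in such a table, the same result set can be read directly by statistical software through a database driver, without intermediate export files.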