Automatic detection of edited parts in inexact transcribed corpora based on alignment between edited transcription and corresponding utterance

Authors:
Kengo Ohta;Masatoshi Tsuchiya;Seiichi Nakagawa
Affiliations:
Toyohashi University of Technology, Dept. of Computer Sciences and Engineering, Japan;Toyohashi University of Technology, Information Media Center, Japan;Toyohashi University of Technology, Dept. of Computer Sciences and Engineering, Japan
Venue:
ROCOM'11/MUSP'11 Proceedings of the 11th WSEAS international conference on robotics, control and manufacturing technology, and 11th WSEAS international conference on Multimedia systems & signal processing
Year:
2011

Abstract

The availability of large-scale spontaneous speech corpora is crucially important for various domains of spoken language processing. However, the available corpora are usually limited because they are costly to prepare. On the other hand, inexactly transcribed corpora have been widely produced in the form of shorthand notes, meeting records, or closed captions. Although these inexactly transcribed corpora are more freely available than faithful (exact) ones, they are not faithfully transcribed and contain edited transcriptions. Against this background, we are considering building an efficient semi-automatic framework for converting inexact transcripts into faithful (exact) transcriptions. This framework consists of two steps: the first step automatically detects the positions of edited parts, and the second step manually transcribes the edited parts. This paper proposes a method for automatically detecting edited parts in inexactly transcribed corpora for this framework. In the proposed method, an automatic alignment between the edited transcription and its corresponding utterance is performed, and then a support vector machine based detector is applied to detect edited parts using features obtained from the automatic alignment. In an evaluation on the Japanese National Diet Record, a reasonable result was obtained in the speaker-closed condition.
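
The detection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not specify the alignment procedure or the features, so this example substitutes a plain word-level text alignment (Python's `difflib.SequenceMatcher`) between the edited transcript and a hypothetical speech-recognition output of the same utterance. The function name `detect_edited_spans` and the sample word lists are assumptions made for illustration only. In the paper, features derived from the alignment are passed to a support vector machine; here every non-matching region is simply reported as a candidate edited part.

```python
from difflib import SequenceMatcher


def detect_edited_spans(transcript_words, recognized_words):
    """Report where an edited transcript disagrees with what was recognized.

    Hypothetical helper, not taken from the paper. Returns a list of
    (tag, start, end, spoken_words) tuples, where start/end index into
    transcript_words and spoken_words is the recognized material that
    the transcript omits or replaces at that position.
    """
    # autojunk=False keeps frequent words (e.g. particles) in the alignment.
    matcher = SequenceMatcher(a=transcript_words, b=recognized_words, autojunk=False)
    spans = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # "insert": spoken but missing from the transcript (e.g. fillers)
            # "delete": in the transcript but not spoken
            # "replace": transcript wording differs from what was spoken
            spans.append((tag, i1, i2, recognized_words[j1:j2]))
    return spans


# Assumed toy data: the transcript drops a filler and a repeated word.
transcript = ["we", "will", "discuss", "the", "budget"]
recognized = ["uh", "we", "will", "discuss", "the", "the", "budget"]

for tag, start, end, spoken in detect_edited_spans(transcript, recognized):
    print(tag, start, end, spoken)
```

Under the two-step framework, the spans flagged this way would be the positions handed to a human transcriber in the second step. A real system would also have to cope with recognition errors, which is one reason a trained classifier is preferable to flagging every mismatch.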