Toward developing a very big sign language parallel corpus

Authors:
Achraf Othman;Zouhour Tmar;Mohamed Jemni
Affiliations:
Research Laboratory LaTICE, University of Tunis, Tunis, Bab Mnara, Tunisia;Research Laboratory LaTICE, University of Tunis, Tunis, Bab Mnara, Tunisia;Research Laboratory LaTICE, University of Tunis, Tunis, Bab Mnara, Tunisia
Venue:
ICCHP'12 Proceedings of the 13th international conference on Computers Helping People with Special Needs - Volume Part II
Year:
2012

Abstract

The sign language research community faces a serious problem: the absence of a large parallel corpus for sign languages. The ASLG-PC12 project, conducted in our laboratory, proposes a rule-based approach for building a large parallel corpus between written English text and American Sign Language (ASL) gloss. In this paper, we present a new algorithm that transforms a part-of-speech-tagged English sentence into ASL gloss. The project started at the beginning of 2011 and today offers a corpus of more than one hundred million sentence pairs between English and ASL gloss. The corpus is freely available online to support the development of new algorithms and theories for sign language processing, for instance statistical machine translation and related fields. In particular, we present the tasks involved in generating ASL sentences from the Project Gutenberg corpus, which contains only written English text.
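
To make the idea of a rule-based English-to-gloss transformation concrete, the following is a minimal sketch, not the ASLG-PC12 algorithm itself. It assumes the input sentence has already been tokenized and tagged with Penn Treebank part-of-speech tags. The rules (drop determiners, the infinitive marker "to", forms of "be", and punctuation, then write the remaining words in capitals, as glosses are conventionally written) and the names `to_gloss` and `build_pairs` are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# A token is a (word, POS tag) pair; Penn Treebank tags are assumed.
Token = Tuple[str, str]

# Hypothetical rules for illustration only, not the ASLG-PC12 rule set.
DROPPED_TAGS = {"DT", "TO"}  # determiners and the infinitive marker
COPULAS = {"am", "is", "are", "was", "were", "be", "been", "being"}


def to_gloss(tagged: List[Token]) -> str:
    """Map one tagged English sentence to a gloss-like string."""
    glosses = []
    for word, tag in tagged:
        if tag in DROPPED_TAGS or word.lower() in COPULAS:
            continue  # rule: these words get no gloss in this sketch
        if not any(ch.isalnum() for ch in word):
            continue  # rule: skip punctuation tokens
        glosses.append(word.upper())  # glosses are written in capitals
    return " ".join(glosses)


def build_pairs(sentences: List[List[Token]]) -> List[Tuple[str, str]]:
    """Build (English, gloss) pairs, the basic unit of a parallel corpus."""
    return [(" ".join(w for w, _ in s), to_gloss(s)) for s in sentences]


example = [
    ("The", "DT"), ("boy", "NN"), ("is", "VBZ"),
    ("reading", "VBG"), ("a", "DT"), ("book", "NN"), (".", "."),
]
pairs = build_pairs([example])
```

Applying the same function to every sentence of a large monolingual English collection yields one sentence pair per input sentence, which is how a rule-based approach can scale to a very large parallel corpus without manual annotation.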