A Hindi speech recognizer for an agricultural video search application

Authors:
Kalika Bali;Sunayana Sitaram;Sebastien Cuendet;Indrani Medhi
Affiliations:
Microsoft Research, Bangalore, India;Carnegie Mellon University;École Polytechnique Fédérale de Lausanne, Switzerland;Microsoft Research, Bangalore, India
Venue:
Proceedings of the 3rd ACM Symposium on Computing for Development
Year:
2013

Citing 7

A large-vocabulary continuous speech recognition system for Hindi
IBM Journal of Research and Development

Speech interfaces for information access by low literate users
Speech interfaces for information access by low literate users

Mobile-izing health workers in rural India
Proceedings of the SIGCHI Conference on Human Factors in Computing Systems

Small-vocabulary speech recognition for resource-scarce languages
Proceedings of the First ACM Symposium on Computing for Development

Discriminative pronunciation learning for speech recognition for resource scarce languages
Proceedings of the 2nd ACM Symposium on Computing for Development

Improving literacy in developing countries using speech recognition-supported games on mobile devices
Proceedings of the SIGCHI Conference on Human Factors in Computing Systems

VideoKheti: making video content accessible to low-literate and novice users
Proceedings of the SIGCHI Conference on Human Factors in Computing Systems

Cited 1

VideoKheti: making video content accessible to low-literate and novice users
Proceedings of the SIGCHI Conference on Human Factors in Computing Systems

Abstract

Voice user interfaces for ICTD applications have immense potential to reach the large illiterate or semi-literate populations of regions where text-based interfaces are of little use. However, building speech systems for a new language is a highly resource-intensive task. Past work has attempted to develop techniques that circumvent the need for the large amounts of data and technical expertise such systems require. In this paper we present the development and evaluation of an application-specific speech recognizer for Hindi. We use the Salaam method [4], which bootstraps from a high-quality English speech engine, to develop a mobile, speech-based agricultural video search application for farmers in India. With very little training data for a 79-word vocabulary, we achieve 90% accuracy in both test and field deployments. We also report observations from the field that we believe are critical to the effective development and usability of speech applications in ICTD.