This study presents a novel method that measures English language learners' syntactic competence with the aim of improving automated speech scoring systems. In contrast to most previous studies, which focus on the length of production units such as the mean length of clauses, we focus on capturing differences in the distribution of morpho-syntactic features, or grammatical expressions, across proficiency levels. We estimate syntactic competence using corpus-based NLP techniques. Assuming that the range and sophistication of grammatical expressions can be captured by the distribution of part-of-speech (POS) tags, we constructed vector space models of POS tags from a large corpus of English learners' responses classified into four proficiency levels by human raters. Our proposed feature measures the similarity of a given response to the most proficient group and thereby estimates the learner's syntactic competence level. Our method attained a significant correlation with human-rated scores, substantially outperforming state-of-the-art measures of syntactic complexity: the correlation between human-rated scores and features based on manual transcriptions was 0.43, and the correlation based on ASR hypotheses was only slightly lower at 0.42. Important advantages of our method are its robustness against speech recognition errors and the simplicity of feature generation, which captures a reasonable set of learner-specific syntactic errors.
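The core of the similarity feature can be sketched as a comparison between POS-tag distribution vectors. The abstract does not specify the exact term weighting, tag set, or similarity measure used, so the following is a minimal sketch assuming relative-frequency vectors, cosine similarity, and hypothetical Penn Treebank-style tagged responses:

```python
from collections import Counter
from math import sqrt

def pos_vector(tags):
    """Normalized POS-tag frequency vector for one response."""
    counts = Counter(tags)
    total = sum(counts.values())
    return {tag: c / total for tag, c in counts.items()}

def cosine(u, v):
    """Cosine similarity between two sparse vectors (dicts)."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = sqrt(sum(w * w for w in u.values()))
    nv = sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

# Hypothetical POS-tagged responses from the highest proficiency level
level4_responses = [
    ["PRP", "VBP", "DT", "JJ", "NN", "IN", "DT", "NN"],
    ["PRP", "MD", "VB", "DT", "NN", "CC", "VB", "RB"],
]

# Centroid of the most proficient group's POS distributions
centroid = Counter()
for resp in level4_responses:
    for tag, w in pos_vector(resp).items():
        centroid[tag] += w / len(level4_responses)

# Score a new learner response by its similarity to the proficient centroid
test_response = ["PRP", "VBP", "DT", "NN", "IN", "NN"]
similarity = cosine(pos_vector(test_response), dict(centroid))
```

Because the feature depends only on the POS-tag distribution rather than on exact words, a tagger applied to ASR output can yield a similar distribution even when individual words are misrecognized, which is consistent with the small gap reported between transcription-based (0.43) and ASR-based (0.42) correlations.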