This paper explores automated content scoring of non-native spontaneous speech, using ontology-based information to enhance a vector space approach. We use content vector analysis as a baseline and evaluate the correlations between human raters' proficiency scores and two cosine-similarity-based features previously used in automated essay scoring. To improve these feature correlations, we apply two ontology-facilitated approaches that exploit the semantic knowledge encoded in WordNet: (1) extending word vectors with semantic concepts (synsets) from the WordNet ontology; and (2) using a reasoning approach that exploits the hierarchical structure of WordNet to estimate weights for concepts not present in the set of training responses. Furthermore, we compare features computed from human transcriptions of spoken responses with features based on the output of an automatic speech recognizer. We find that (1) for one of the two features, both ontology-based approaches improve average feature correlations with human scores, and (2) the correlations for both features decrease only marginally when moving from human transcriptions to speech recognizer output.
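As a rough illustration of the idea of extending word vectors with semantic concepts, the sketch below compares two short responses by cosine similarity, first over plain word counts and then over counts extended with concept identifiers. It is a minimal sketch under stated assumptions, not the paper's method: the `TOY_CONCEPTS` map is a hypothetical stand-in for WordNet synsets, raw term counts replace whatever weighting the authors used, and the function names are invented for this example.

```python
import math
from collections import Counter

# Hypothetical toy word-to-concept map standing in for WordNet synsets.
# A real system would look these up in WordNet; this table is an assumption.
TOY_CONCEPTS = {
    "car": "concept:vehicle",
    "automobile": "concept:vehicle",
    "teacher": "concept:educator",
    "instructor": "concept:educator",
}


def term_vector(text, concepts=None):
    """Build a raw-count vector; optionally add one count per mapped concept."""
    tokens = text.lower().split()
    counts = Counter(tokens)
    if concepts:
        for tok in tokens:
            if tok in concepts:
                counts[concepts[tok]] += 1
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


response = "the teacher drove a car"
reference = "the instructor drove an automobile"

# Word-only vectors share just "the" and "drove".
plain = cosine(term_vector(response), term_vector(reference))

# Concept-extended vectors also share the two concept dimensions.
extended = cosine(
    term_vector(response, TOY_CONCEPTS),
    term_vector(reference, TOY_CONCEPTS),
)
```

In this toy case the extended vectors score higher than the word-only vectors, because synonyms that share no surface form still contribute through their common concept dimension.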