To control the quality of internet-based language corpora, we developed a method to verify automatically that texts are of (near-)native quality. For the LOCNESS and ICLE corpora, the method is rather successful in separating native texts from non-native learner texts, with an Equal Error Rate of about 10%. For other domains, such as internet texts, however, separate classifiers have to be trained on suitable seed corpora.
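
For readers unfamiliar with the metric, the Equal Error Rate is the operating point at which the false accept rate (non-native texts accepted as native) equals the false reject rate (native texts rejected). The sketch below is only an illustration of how such a value can be computed from classifier scores; it is not the authors' procedure. The function name `equal_error_rate`, the convention that a higher score means "more native-like", and the toy scores are all assumptions made for this example.

```python
def equal_error_rate(native_scores, nonnative_scores):
    """Return (eer, threshold) at the point where the false accept rate
    and false reject rate are closest.

    Assumption: a text is accepted as native when its score >= threshold.
    """
    # Every observed score is a candidate threshold.
    thresholds = sorted(set(native_scores) | set(nonnative_scores))
    best = None
    for t in thresholds:
        # False reject rate: native texts scoring below the threshold.
        frr = sum(s < t for s in native_scores) / len(native_scores)
        # False accept rate: non-native texts scoring at or above it.
        far = sum(s >= t for s in nonnative_scores) / len(nonnative_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            # Report the mean of the two rates at the closest crossing.
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Toy scores, invented purely for illustration (not data from the paper).
native = [0.9, 0.8, 0.7, 0.6, 0.3]
nonnative = [0.1, 0.2, 0.4, 0.5, 0.65]
eer, threshold = equal_error_rate(native, nonnative)
```

On finite samples the two error rates rarely cross exactly, so the sketch reports their mean at the threshold where they are closest; other interpolation choices are equally defensible.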