We apply machine learning techniques to study language transfer, a major topic in the theory of Second Language Acquisition (SLA). Using an SVM for native language classification, we show that a careful analysis of the effects of various features can lead to scientific insights. In particular, we demonstrate that character bigrams alone yield a classification accuracy of about 66% on a 5-class task, even when differences in content and function words are accounted for. This may indicate that native language has a strong effect on the word choice of people writing in a second language.
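
As a rough illustration of the character-bigram representation mentioned above, the following minimal sketch covers the feature-extraction step only. It assumes overlapping bigrams, no case folding, and relative-frequency normalization, none of which is specified in the abstract. The function names are hypothetical, and the SVM training step is not shown.

```python
from collections import Counter


def char_bigrams(text):
    """Count overlapping character bigrams in a text (hypothetical helper)."""
    # Assumption: bigrams overlap and include spaces and punctuation.
    return Counter(text[i:i + 2] for i in range(len(text) - 1))


def relative_frequencies(counts):
    """Turn raw bigram counts into relative frequencies (assumed normalization)."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {bigram: n / total for bigram, n in counts.items()}


sample = "the cat"
counts = char_bigrams(sample)
features = relative_frequencies(counts)
```

Each text would then be represented by its bigram frequency vector, and those vectors could be passed to any multi-class SVM implementation.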