This paper investigates the feasibility of predicting the gender of a text document's author from linguistic evidence. To this end, term-based and style-based classification techniques are evaluated over a large collection of chat messages. Prediction accuracies of up to 84.2% are achieved, illustrating the applicability of these techniques to gender prediction. The reverse problem is also explored, and the effect of gender on writing style is discussed.
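To make the distinction between term-based and style-based evidence concrete, the following is a minimal sketch of the two kinds of feature extraction for a single chat message. The function names and the particular style statistics are illustrative assumptions; the abstract does not specify the feature set or classifiers that were actually used.

```python
import re
from collections import Counter


def term_features(message):
    """Term-based view: counts of lowercased word tokens (bag of words).

    Hypothetical helper; the tokenization rule is an assumption.
    """
    return Counter(re.findall(r"[a-z']+", message.lower()))


def style_features(message):
    """Style-based view: surface statistics that ignore which words occur.

    The chosen statistics are assumed examples, not the paper's features.
    """
    words = message.split()
    n_chars = len(message)
    return {
        "num_words": len(words),
        "avg_word_len": sum(len(w) for w in words) / len(words) if words else 0.0,
        "exclamations": message.count("!"),
        "question_marks": message.count("?"),
        "uppercase_ratio": sum(c.isupper() for c in message) / n_chars if n_chars else 0.0,
    }
```

Either feature dictionary could then be passed to a standard classifier. The term-based view captures what is said, while the style-based view captures how it is written.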