Authorship attribution (AA) aims to automatically recognize the author of a given text sample. Traditionally applied to literary texts, AA now faces the new challenge of recognizing the identity of people involved in chat conversations. These share many aspects with spoken conversations, but AA approaches have not taken this into account so far. This paper tries to fill the gap and proposes two novelties that improve the effectiveness of traditional AA approaches for this type of data: the first is to adopt features inspired by Conversation Analysis (in particular, turn-taking); the second is to extract the features from individual turns rather than from entire conversations. The experiments were performed on a corpus of dyadic chat conversations (77 individuals in total). The performance in identifying the persons involved in each exchange, measured as the area under the Cumulative Match Characteristic curve, is 89.5%.
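To make the evaluation measure concrete, the following is a minimal sketch of how a Cumulative Match Characteristic (CMC) curve and its normalized area could be computed from a matrix of matching scores. The function names `cmc_curve` and `normalized_auc`, the score-matrix layout, and the choice to normalize the area as the mean of the curve over all ranks are illustrative assumptions; the abstract does not specify how the authors computed it.

```python
import numpy as np

def cmc_curve(scores, true_ids):
    """Return the CMC curve for a set of probes.

    scores[i, j] : similarity between probe i and gallery identity j
                   (higher means a better match) -- assumed layout.
    true_ids[i]  : column index of the correct identity for probe i.
    Entry k-1 of the result is the fraction of probes whose correct
    identity is ranked within the top k candidates.
    """
    scores = np.asarray(scores, dtype=float)
    true_ids = np.asarray(true_ids)
    n_probes, n_gallery = scores.shape

    # Score that each probe assigns to its correct identity.
    true_scores = scores[np.arange(n_probes), true_ids]

    # Rank of the correct identity: 1 + number of identities scoring strictly higher.
    ranks = 1 + np.sum(scores > true_scores[:, None], axis=1)

    # Count probes per rank, then accumulate to get the curve.
    hits_per_rank = np.bincount(ranks, minlength=n_gallery + 1)[1:]
    return np.cumsum(hits_per_rank) / n_probes

def normalized_auc(cmc):
    """Area under the CMC curve, normalized to [0, 1] as the mean over ranks
    (an assumed normalization, chosen for illustration)."""
    return float(np.mean(cmc))

# Toy example with 3 probes and 3 gallery identities (made-up scores).
toy_scores = [[0.9, 0.1, 0.0],
              [0.2, 0.3, 0.5],
              [0.1, 0.2, 0.7]]
toy_cmc = cmc_curve(toy_scores, [0, 1, 2])
print(toy_cmc, normalized_auc(toy_cmc))
```

Under this normalization, a value of 1.0 would mean that every probe's correct author is ranked first, so the measure rewards systems that place the true author near the top of the candidate list even when the top-ranked guess is wrong.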