Many criminals exploit the convenience of anonymity in the cyber world to conduct illegal activities, and e-mail is the most commonly used medium for such activities. Extracting knowledge and information from e-mail text has therefore become an important step in cybercrime investigation and evidence collection. Yet it is one of the most challenging and time-consuming tasks because of the special characteristics of e-mail datasets. In this paper, we focus on the problem of mining writing styles from a collection of e-mails written by multiple anonymous authors. The general idea is to first cluster the anonymous e-mails by their stylometric features and then extract the writeprint, i.e., the unique writing style, from each cluster. We emphasize that this problem, together with our proposed solution, differs from the traditional problem of authorship identification, which assumes that training data is available for building a classifier. Our proposed method is particularly useful in the initial stage of an investigation, in which the investigator usually has very little information about the case and the true authors of the suspicious e-mail collection. Experiments on a real-life dataset suggest that clustering by writing style is a promising approach for grouping e-mails written by the same author.
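The two-step idea described above (cluster by stylometric features, then summarize each cluster) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the feature set, the short function-word list, the use of plain k-means, the choice of k = 2, and the toy messages are not taken from the paper, and the per-cluster mean vector is only a stand-in for a writeprint, not the extraction procedure the paper proposes.

```python
import math
import re
import string

# Illustrative subset of function words (an assumption, not the paper's list).
FUNCTION_WORDS = ("the", "of", "and", "to", "in", "that", "is", "it")

# Toy messages in two visibly different styles (invented for this sketch).
EMAILS = [
    "HEY!!! send money NOW!!! do not wait!!!",
    "HEY!!! call me NOW!!! no cops OK!!!",
    "PAY UP!!! you know our deal!!! NOW!!!",
    "I would like to inform you that the shipment of the documents is "
    "scheduled for the following week, and it is expected to arrive in good order.",
    "It is my understanding that the committee intends to review the proposal in "
    "the coming month, and that the outcome of the review is to be communicated in writing.",
    "Please be advised that the arrangement described in the previous letter remains "
    "in effect, and that it is the responsibility of the recipient to confirm the details.",
]

def stylometric_features(text):
    """Map one e-mail body to a small vector of style (not topic) features."""
    words = re.findall(r"[A-Za-z']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_words = max(len(words), 1)
    n_chars = max(len(text), 1)
    lowered = [w.lower() for w in words]
    features = [
        sum(len(w) for w in words) / n_words,                  # mean word length
        len(words) / max(len(sentences), 1),                   # mean words per sentence
        sum(c.isupper() for c in text) / n_chars,              # uppercase ratio
        sum(c in string.punctuation for c in text) / n_chars,  # punctuation ratio
    ]
    # Relative frequency of each function word.
    features += [lowered.count(fw) / n_words for fw in FUNCTION_WORDS]
    return features

def standardize(rows):
    """Z-score each column so that no single feature dominates the distance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = []
    for c, m in zip(cols, means):
        sd = math.sqrt(sum((x - m) ** 2 for x in c) / len(c))
        stds.append(sd or 1.0)  # constant columns are left unscaled
    return [[(x - m) / s for x, m, s in zip(r, means, stds)] for r in rows]

def kmeans(points, k, iterations=50):
    """Plain k-means with deterministic farthest-first initialization."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

def cluster_profiles(raw_rows, labels, k):
    """Mean raw feature vector per cluster: a crude stand-in for a writeprint."""
    profiles = {}
    for j in range(k):
        members = [r for r, lab in zip(raw_rows, labels) if lab == j]
        if members:
            profiles[j] = [sum(col) / len(members) for col in zip(*members)]
    return profiles

if __name__ == "__main__":
    raw = [stylometric_features(e) for e in EMAILS]
    labels, _ = kmeans(standardize(raw), k=2)
    for label, profile in cluster_profiles(raw, labels, 2).items():
        print(label, [round(v, 3) for v in profile])
```

No author labels are used at any point, which mirrors the setting described in the abstract, where no training data is available. In a real investigation the number of authors is unknown, so k would have to be chosen or estimated rather than fixed as it is here.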