Online anonymity hinders social accountability, as substantiated by the high levels of cybercrime. Although identity cues are scarce in cyberspace, individuals often leave behind textual identity traces. In this study we proposed the use of stylometric analysis techniques to help identify individuals based on writing style. We incorporated a rich set of stylistic features, including lexical, syntactic, structural, content-specific, and idiosyncratic attributes. We also developed the Writeprints technique for identification and similarity detection of anonymous identities. Writeprints is a Karhunen-Loeve transform-based technique that uses a sliding window and a pattern disruption algorithm with individual-author-level feature sets. The Writeprints technique and the extended feature set were evaluated on a testbed encompassing four online datasets spanning different domains: email, instant messaging, feedback comments, and program code. Writeprints outperformed benchmark techniques, including SVM, Ensemble SVM, PCA, and standard Karhunen-Loeve transforms, on the identification and similarity detection tasks, with accuracy as high as 94% when differentiating between 100 authors. The extended feature set also significantly outperformed a baseline set of features commonly used in previous research. Furthermore, individual-author-level feature sets generally outperformed the use of a single, shared group of attributes.
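To make the sliding-window and Karhunen-Loeve transform ideas in the abstract more concrete, the following is a minimal sketch and not the Writeprints algorithm itself. It extracts a handful of stylistic features from overlapping word windows and projects each window onto the leading Karhunen-Loeve basis vector (the top eigenvector of the feature covariance matrix), found here by power iteration. The feature list, the window size and step, and all function names are illustrative assumptions. The pattern disruption step and the per-author feature selection are not modelled.

```python
import math
import string

# Assumed, illustrative subset of function words; not the paper's feature set.
FUNCTION_WORDS = ("the", "of", "and", "to", "a", "in")


def window_features(words):
    """Map a non-empty list of word tokens to a small stylistic feature vector."""
    n = len(words)
    stripped = [w.strip(string.punctuation).lower() for w in words]
    avg_len = sum(len(w) for w in stripped) / n          # lexical: mean word length
    punct = sum(ch in string.punctuation for w in words for ch in w) / n
    func = [stripped.count(fw) / n for fw in FUNCTION_WORDS]  # function-word rates
    return [avg_len, punct] + func


def sliding_windows(text, size=20, step=10):
    """Yield one feature vector per overlapping window of up to `size` words."""
    words = text.split()
    if not words:
        return
    for start in range(0, max(len(words) - size, 0) + 1, step):
        yield window_features(words[start:start + size])


def leading_component(vectors, iterations=200):
    """Return (mean, unit vector) approximating the top covariance eigenvector."""
    dim = len(vectors[0])
    count = len(vectors)
    mean = [sum(v[i] for v in vectors) / count for i in range(dim)]
    centered = [[v[i] - mean[i] for i in range(dim)] for v in vectors]
    cov = [[sum(c[i] * c[j] for c in centered) / count for j in range(dim)]
           for i in range(dim)]
    vec = [1.0 / math.sqrt(dim)] * dim                    # arbitrary unit start
    for _ in range(iterations):                           # power iteration
        nxt = [sum(cov[i][j] * vec[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:                                   # no variance to follow
            break
        vec = [x / norm for x in nxt]
    return mean, vec


def project(vector, mean, component):
    """Coordinate of one window along the given basis vector."""
    return sum((x - m) * c for x, m, c in zip(vector, mean, component))


if __name__ == "__main__":
    sample = "the cat sat on the mat, and the dog ran to a park in town. " * 5
    windows = list(sliding_windows(sample))
    mean, component = leading_component(windows)
    for w in windows:
        print(round(project(w, mean, component), 4))
```

In a fuller treatment one would keep several basis vectors rather than one and compare the projected window patterns of an anonymous text against those of each candidate author. A linear algebra library would normally replace the hand-written power iteration.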