Statistical authorship attribution has a long history, culminating in the use of modern machine learning classification methods. Nevertheless, most of this work suffers from the limitation of assuming a small closed set of candidate authors and essentially unlimited training text for each. Real-life authorship attribution problems, however, typically fall short of this ideal. Thus, following detailed discussion of previous work, three scenarios are considered here for which solutions to the basic attribution problem are inadequate. In the first variant, the profiling problem, there is no candidate set at all; in this case, the challenge is to provide as much demographic or psychological information as possible about the author. In the second variant, the needle-in-a-haystack problem, there are many thousands of candidates for each of whom we might have a very limited writing sample. In the third variant, the verification problem, there is no closed candidate set but there is one suspect; in this case, the challenge is to determine if the suspect is or is not the author. For each variant, it is shown how machine learning methods can be adapted to handle the special challenges of that variant. © 2009 Wiley Periodicals, Inc.