Techniques for identifying the author of an unattributed document can be applied to problems in information analysis and in academic scholarship. A range of methods has been proposed in the research literature, using a variety of features and machine learning approaches, but these methods have been tested on very different data, so their results cannot be compared. It is not even clear whether the differences in performance are due to feature selection or to other variables. In this paper we examine the use of a large, publicly available collection of newswire articles as a benchmark for comparing authorship attribution methods. To demonstrate the value of such a benchmark, we experimentally compare several recent feature-based techniques for authorship attribution and test how well they perform as the volume of data increases. We show that the benchmark clearly distinguishes between the different approaches, and that the best methods, which are based on function-word features, scale acceptably, with only a moderate decline in performance as the difficulty of the problem increases.