Markov chains are used as a formal mathematical model for sequences of elements of a text, and the model is applied to authorship attribution. The elements considered are either letters or grammatical classes of words. The frequencies of letter pairs and of pairs of grammatical classes in a Russian text turn out to be rather stable characteristics of an author and could apparently be used to attribute texts of disputed authorship. Results are compared for several modifications of the method, using both letters and grammatical classes. The experiments involve 385 texts by 82 writers. The Appendix describes the research of D.V. Khmelev, in which data compression algorithms are applied to authorship attribution.
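The letter-pair idea can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the additive smoothing constant `alpha`, the default `alphabet_size` of 33 (the size of the Russian alphabet), and the choice to drop non-letter characters before pairing are all assumptions made for illustration. Each author is modelled by letter-pair counts pooled from known texts, and a disputed text is assigned to the author whose first-order Markov chain gives it the smallest negative log-likelihood.

```python
import math
from collections import Counter


def pair_counts(text):
    """Count adjacent letter pairs; non-letter characters are dropped (an assumption)."""
    letters = [ch for ch in text.lower() if ch.isalpha()]
    return Counter(zip(letters, letters[1:]))


def train(texts):
    """Pool the letter-pair counts of one author's known texts."""
    counts = Counter()
    for t in texts:
        counts.update(pair_counts(t))
    return counts


def neg_log_likelihood(model, text, alphabet_size=33, alpha=1.0):
    """Negative log-likelihood of the text's letter transitions under the
    chain estimated from `model`. Additive smoothing (alpha) is an
    illustrative assumption so that unseen pairs get nonzero probability."""
    row_totals = Counter()
    for (first, _second), n in model.items():
        row_totals[first] += n
    score = 0.0
    for (first, second), n in pair_counts(text).items():
        p = (model[(first, second)] + alpha) / (
            row_totals[first] + alpha * alphabet_size
        )
        score -= n * math.log(p)
    return score


def attribute(models, text):
    """Return the author whose model gives the smallest negative log-likelihood."""
    return min(models, key=lambda author: neg_log_likelihood(models[author], text))
```

With `models = {"A": train(texts_by_a), "B": train(texts_by_b)}`, a call to `attribute(models, disputed_text)` returns the key of the best-matching author. The same scheme applies to pairs of grammatical classes if the text is first converted to a sequence of class labels.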