In this paper, we study the problem of authorship identification in Bengali literary works. We consider three authors: Rabindranath Tagore, Bankim Chandra Chattopadhyay and Sukanta Bhattacharyay. Simple unigram and bi-gram features, together with vocabulary richness, proved sufficient to discriminate among these authors, although results degraded slightly when the training set was small. With a larger training set, classification accuracy exceeded 90% for unigram features and reached almost 100% for bi-gram features. More sophisticated features could improve the results further.
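To make the three feature types concrete, the following is a minimal sketch of how they might be computed for one text. It rests on assumptions not stated above: n-grams are taken at the word level, tokens are split on whitespace, and vocabulary richness is measured as the type-token ratio. The function names (`tokenize`, `ngram_counts`, `type_token_ratio`, `extract_features`) are hypothetical and are not taken from the paper.

```python
from collections import Counter


def tokenize(text):
    # Assumption: whitespace tokenization; the paper's tokenizer is not specified.
    return text.split()


def ngram_counts(tokens, n):
    # Count every contiguous run of n tokens (n=1: unigrams, n=2: bi-grams).
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def type_token_ratio(tokens):
    # Assumption: vocabulary richness = distinct tokens / total tokens.
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def extract_features(text):
    # Bundle the three feature types for a single document.
    tokens = tokenize(text)
    return {
        "unigrams": ngram_counts(tokens, 1),
        "bigrams": ngram_counts(tokens, 2),
        "ttr": type_token_ratio(tokens),
    }
```

Calling `extract_features` on each training text gives per-author count profiles that could be passed to any standard classifier. Which classifier the study used is not stated here, so that step is left out.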