Author identification in Bengali literary works

  • Authors:
  • Suprabhat Das; Pabitra Mitra

  • Affiliations:
  • Department of Computer Science and Engineering, Indian Institute of Technology Kharagpur, West Bengal, India; Department of Computer Science and Engineering, Indian Institute of Technology Kharagpur, West Bengal, India

  • Venue:
  • PReMI'11: Proceedings of the 4th International Conference on Pattern Recognition and Machine Intelligence
  • Year:
  • 2011

Abstract

In this paper, we study the problem of authorship identification in Bengali literary works. We considered three authors, namely Rabindranath Tagore, Bankim Chandra Chattopadhyay and Sukanta Bhattacharyay. We observed that simple unigram and bi-gram features, together with vocabulary richness, were sufficient to discriminate among these authors, although results degraded slightly when the training set was small. For larger training sets, classification accuracy was above 90% with unigram features and almost 100% with bi-gram features. Results could be improved further by using more sophisticated features.
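
The abstract does not say how the features were computed or which classifier was used, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes word-level n-grams over whitespace-separated tokens, takes the type-token ratio as one possible measure of vocabulary richness, and swaps in a simple nearest-profile rule based on cosine similarity. All function names, the author labels, and the toy English training strings are hypothetical placeholders rather than data from the paper.

```python
from collections import Counter
import math


def ngram_counts(tokens, n):
    """Count contiguous n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def type_token_ratio(tokens):
    """Vocabulary richness as distinct tokens / total tokens (an assumed measure)."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[key] for key, count in a.items() if key in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_profiles(training, n):
    """Merge each author's training texts into a single n-gram count profile."""
    profiles = {}
    for author, texts in training.items():
        profile = Counter()
        for text in texts:
            profile.update(ngram_counts(text.split(), n))
        profiles[author] = profile
    return profiles


def predict(profiles, text, n):
    """Return the author whose n-gram profile is most similar to the text."""
    query = ngram_counts(text.split(), n)
    return max(profiles, key=lambda author: cosine(profiles[author], query))


# Hypothetical toy data; a real experiment would use tokenized Bengali texts.
training = {
    "author_a": ["the river flows past the old temple",
                 "the river sings to the old boatman"],
    "author_b": ["hunger burns in the city streets",
                 "the city streets echo with hunger"],
}

unigram_profiles = build_profiles(training, 1)
bigram_profiles = build_profiles(training, 2)
```

Calling `predict(unigram_profiles, some_text, 1)` or `predict(bigram_profiles, some_text, 2)` then returns the label of the closest author profile. In a fuller setup the type-token ratio would be appended to the feature vector instead of being computed separately.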