Syntactic dependency-based n-grams: more evidence of usefulness in classification

  • Authors:
  • Grigori Sidorov; Francisco Velasquez; Efstathios Stamatatos; Alexander Gelbukh; Liliana Chanona-Hernández

  • Affiliations:
  • Center for Computing Research (CIC), Instituto Politécnico Nacional (IPN), Mexico City, Mexico; Center for Computing Research (CIC), Instituto Politécnico Nacional (IPN), Mexico City, Mexico; University of the Aegean, Greece; Center for Computing Research (CIC), Instituto Politécnico Nacional (IPN), Mexico City, Mexico; ESIME, Instituto Politécnico Nacional (IPN), Mexico City, Mexico

  • Venue:
  • CICLing'13: Proceedings of the 14th International Conference on Computational Linguistics and Intelligent Text Processing, Volume Part I
  • Year:
  • 2013

Abstract

The paper introduces and discusses the concept of syntactic n-grams (sn-grams), which can be used in place of traditional n-grams in many NLP tasks. Sn-grams are constructed by following paths in syntactic trees rather than the linear order of words in the text, so they bring syntactic knowledge into machine learning methods. Their construction does, however, require that the text be parsed beforehand. We applied sn-grams to the task of authorship attribution on corpora of three and seven authors, with very promising results.
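
To illustrate how following paths in a syntactic tree differs from reading words in surface order, here is a minimal Python sketch. It is not the authors' implementation: the function name `sn_grams`, the head-index encoding of the tree, and the hand-written parse of the example sentence are all assumptions made for illustration, and the sketch only extracts continuous head-to-dependent paths, which may not cover every variant of sn-grams defined in the paper.

```python
from collections import defaultdict


def sn_grams(words, heads, n):
    """Return sn-grams of size n as tuples of words (hypothetical helper).

    words: list of tokens in surface order
    heads: heads[i] is the index of the head of token i, or -1 for the root
    Each sn-gram follows a downward path (head -> dependent) in the tree.
    """
    # Build a head -> list-of-dependents map from the head indices.
    children = defaultdict(list)
    for child, head in enumerate(heads):
        if head >= 0:
            children[head].append(child)

    def paths_from(node, length):
        # All downward paths of exactly `length` nodes that start at `node`.
        if length == 1:
            return [(words[node],)]
        result = []
        for child in children[node]:
            for tail in paths_from(child, length - 1):
                result.append((words[node],) + tail)
        return result

    grams = []
    for start in range(len(words)):
        grams.extend(paths_from(start, n))
    return grams


# Hand-written dependency parse (an assumption, not parser output):
# "chased" is the root; "cat" and "mouse" depend on it;
# "the" depends on "cat"; "a" depends on "mouse".
words = ["the", "cat", "chased", "a", "mouse"]
heads = [1, 2, -1, 4, 2]

bigrams = sn_grams(words, heads, 2)
trigrams = sn_grams(words, heads, 3)
```

In this toy example the pair ("chased", "mouse") is an sn-gram even though the two words are not adjacent in the sentence, whereas the traditional bigram "chased a" does not appear because no syntactic arc links those words. In practice the `heads` list would come from a dependency parser, which is the prior parsing step the abstract mentions.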