Grammatical dependency-based relations for term weighting in text classification

  • Authors:
  • Dat Huynh; Dat Tran; Wanli Ma; Dharmendra Sharma

  • Affiliations:
  • Faculty of Information Sciences and Engineering, University of Canberra, Australia (all authors)

  • Venue:
  • PAKDD'11 Proceedings of the 15th Pacific-Asia conference on Advances in knowledge discovery and data mining - Volume Part I
  • Year:
  • 2011

Abstract

Term frequency and term co-occurrence are commonly used to estimate term weights in a document. However, these methods do not use grammatical dependency relations among terms to measure the dependency between word features. In this paper, we propose a new approach that uses grammatical relations to estimate the weights of terms in a text document, and we show how to apply the resulting term weighting scheme to text classification. A graph model encodes the extracted relations, and a graph centrality algorithm is then applied to compute scores that represent the significance of each term in the document context. Experiments on multiple corpora with an SVM classifier show that the proposed term weighting approach outperforms approaches based on term frequency and term co-occurrence.
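
The pipeline described in the abstract (relations, then a graph, then centrality scores) can be illustrated with a minimal sketch. The abstract does not name the centrality algorithm, so PageRank on an undirected graph is used here purely as an assumption. The function name `term_weights`, its parameters, and the hand-written (head, dependent) pairs are hypothetical; in practice a dependency parser would supply the pairs.

```python
from collections import defaultdict


def term_weights(relations, damping=0.85, iterations=50):
    """Score terms with PageRank over an undirected graph of dependency pairs.

    PageRank is an assumed stand-in for the unspecified centrality algorithm.
    """
    # Build an undirected adjacency map: one node per term, one edge per relation.
    neighbors = defaultdict(set)
    for head, dependent in relations:
        if head != dependent:
            neighbors[head].add(dependent)
            neighbors[dependent].add(head)

    terms = sorted(neighbors)
    if not terms:
        return {}

    n = len(terms)
    scores = {t: 1.0 / n for t in terms}

    # Power iteration: each term receives an equal share of every neighbor's score.
    for _ in range(iterations):
        scores = {
            t: (1 - damping) / n
            + damping * sum(scores[u] / len(neighbors[u]) for u in neighbors[t])
            for t in terms
        }
    return scores


# Hypothetical (head, dependent) pairs for the sentence
# "The classifier assigns weights to terms."
relations = [
    ("assigns", "classifier"),
    ("assigns", "weights"),
    ("assigns", "terms"),
    ("classifier", "the"),
]

weights = term_weights(relations)
for term, score in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{term}: {score:.3f}")
```

Under these assumptions, the scores could stand in for raw term frequencies in a document's feature vector before a classifier such as an SVM is trained; whether this matches the paper's exact formulation is not stated in the abstract.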