Using Google n-grams to expand word-emotion association lexicon

  • Authors:
  • Jessica Perrie;Aminul Islam;Evangelos Milios;Vlado Keselj

  • Affiliations:
  • Faculty of Computer Science, Dalhousie University, Halifax, NS, Canada;Faculty of Computer Science, Dalhousie University, Halifax, NS, Canada;Faculty of Computer Science, Dalhousie University, Halifax, NS, Canada;Faculty of Computer Science, Dalhousie University, Halifax, NS, Canada

  • Venue:
  • CICLing'13 Proceedings of the 14th international conference on Computational Linguistics and Intelligent Text Processing - Volume 2
  • Year:
  • 2013

Abstract

We present an approach to automatically generate a word-emotion lexicon from a smaller human-annotated lexicon. To identify the feelings associated with a target word (a word being considered for inclusion in the lexicon), our approach uses the frequencies, counts, or unique words that surround it in trigrams from the Google n-gram corpus. The approach was tuned using a subset of the National Research Council of Canada (NRC) word-emotion association lexicon as the training lexicon, and then applied to generate new lexicons of 18,000 words. We present six lexicons, each generated in a different way from the frequencies, counts, or unique words extracted from the n-gram corpus. Finally, we evaluate our approach by comparing each generated lexicon with a human-annotated lexicon on the task of classifying feelings in affective text, and demonstrate that the larger generated lexicons perform better than the human-annotated one.
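
The abstract does not give the scoring formulas, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that a target word inherits emotions from seed-lexicon words that share a trigram with it, and that the three signals named above can be read as summed trigram frequency, number of co-occurrences, and number of distinct seed words. The names `SEED_LEXICON`, `TRIGRAMS`, and `score_target`, and all of the data, are hypothetical.

```python
from collections import defaultdict

# Hypothetical seed lexicon: word -> set of emotions (toy data, not NRC entries).
SEED_LEXICON = {
    "happy": {"joy"},
    "smile": {"joy"},
    "afraid": {"fear"},
    "dark": {"fear", "sadness"},
}

# Hypothetical trigrams with corpus frequencies (toy data, not Google n-grams).
TRIGRAMS = [
    (("happy", "birthday", "party"), 900),
    (("birthday", "party", "smile"), 300),
    (("dark", "party", "night"), 50),
    (("afraid", "of", "spiders"), 700),
    (("dark", "hairy", "spiders"), 120),
]


def score_target(target, trigrams, seed, mode="frequency"):
    """Score emotions for `target` from seed words sharing a trigram with it.

    mode="frequency": sum of trigram frequencies per emotion
    mode="count":     number of (trigram, seed word) co-occurrences per emotion
    mode="unique":    number of distinct seed words per emotion
    These three readings are assumptions made for illustration.
    """
    freq = defaultdict(int)
    count = defaultdict(int)
    unique = defaultdict(set)

    for words, trigram_freq in trigrams:
        if target not in words:
            continue
        # Look at the other words in the trigram, each at most once.
        for neighbour in set(words) - {target}:
            for emotion in seed.get(neighbour, ()):
                freq[emotion] += trigram_freq
                count[emotion] += 1
                unique[emotion].add(neighbour)

    if mode == "frequency":
        return dict(freq)
    if mode == "count":
        return dict(count)
    if mode == "unique":
        return {emotion: len(ws) for emotion, ws in unique.items()}
    raise ValueError(f"unknown mode: {mode!r}")


if __name__ == "__main__":
    for mode in ("frequency", "count", "unique"):
        print(mode, score_target("party", TRIGRAMS, SEED_LEXICON, mode))
```

With this toy data, "party" scores highest for joy under all three modes. A real lexicon would additionally need a tuned threshold or ranking rule to decide which scored emotions to keep for each word, and the abstract does not specify that rule.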