Part-of-speech tagging for Twitter: annotation, features, and experiments

  • Authors:
  • Kevin Gimpel;Nathan Schneider;Brendan O'Connor;Dipanjan Das;Daniel Mills;Jacob Eisenstein;Michael Heilman;Dani Yogatama;Jeffrey Flanigan;Noah A. Smith

  • Affiliations:
  • Carnegie Mellon University, Pittsburgh, PA;Carnegie Mellon University, Pittsburgh, PA;Carnegie Mellon University, Pittsburgh, PA;Carnegie Mellon University, Pittsburgh, PA;Carnegie Mellon University, Pittsburgh, PA;Carnegie Mellon University, Pittsburgh, PA;Carnegie Mellon University, Pittsburgh, PA;Carnegie Mellon University, Pittsburgh, PA;Carnegie Mellon University, Pittsburgh, PA;Carnegie Mellon University, Pittsburgh, PA

  • Venue:
  • HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: short papers - Volume 2
  • Year:
  • 2011

Abstract

We address the problem of part-of-speech tagging for English data from the popular micro-blogging service Twitter. We develop a tagset, annotate data, develop features, and report tagging results nearing 90% accuracy. The data and tools have been made available to the research community with the goal of enabling richer text analysis of Twitter and related social media data sets.