Revisiting the predictability of language: response completion in social media

  • Authors:
  • Bo Pang;Sujith Ravi

  • Affiliations:
  • Yahoo! Research, Santa Clara, CA;Yahoo! Research, Santa Clara, CA

  • Venue:
  • EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

The question "how predictable is English?" has long fascinated researchers. While prior work has focused on formal English typically used in news articles, we turn to texts generated by users in online settings that are more informal in nature. We are motivated by a novel application scenario: given the difficulty of typing on mobile devices, can we help reduce typing effort with message completion, especially in conversational settings? We propose a method for automatic response completion. Our approach models both the language used in responses and the specific context provided by the original message. Our experimental results on a large-scale dataset show that both components help reduce typing effort. We also perform an information-theoretic study in this setting and examine the entropy of user-generated content, especially in conversational scenarios, to better understand predictability of user generated English.