A statistical approach to mining customers' conversational data from social media

  • Authors:
  • D. Konopnicki;M. Shmueli-Scheuer;D. Cohen;B. Sznajder;J. Herzig;A. Raviv;N. Zwerling;H. Roitman;Y. Mass

  • Affiliations:
  • IBM Research Division, Haifa Research Laboratory, Haifa, Israel;IBM Research Division, Haifa Research Laboratory, Haifa, Israel;IBM Research Division, Haifa Research Laboratory, Haifa, Israel;IBM Research Division, Haifa Research Laboratory, Haifa, Israel;IBM Research Division, Haifa Research Laboratory, Haifa, Israel;IBM Research Division, Haifa Research Laboratory, Haifa, Israel;IBM Research Division, Haifa Research Laboratory, Haifa, Israel;IBM Research Division, Haifa Research Laboratory, Haifa, Israel;IBM Research Division, Haifa Research Laboratory, Haifa, Israel

  • Venue:
  • IBM Journal of Research and Development
  • Year:
  • 2013

Abstract

In this paper, we present one possible way of analyzing social media conversational data in order to better understand customers. Ultimately, our goal is to analyze customer behavior as it is expressed in free-form conversations and to extract from it commercially valuable information about the customer. In this study, we concentrate on using statistical techniques for analyzing this unstructured data at two levels: 1) at the level of the words used in the conversation and 2) by mapping those words to abstract concepts. The goal of such a statistical analysis is twofold. First, the statistically significant terms used by the users and the concepts associated with them provide insight into a user's interests that commercial services can use, for example, in order to target advertisements. Second, knowing the evolution of a customer's interests and hobbies can be exploited commercially by retailers, media and entertainment companies, telecommunications companies, and more. In this paper, we describe a general framework for the analysis of social media data and, in turn, the application of the framework to the statistical analysis of the language of tweets.