A statistical approach to mining customers' conversational data from social media

Authors:
D. Konopnicki;M. Shmueli-Scheuer;D. Cohen;B. Sznajder;J. Herzig;A. Raviv;N. Zwerling;H. Roitman;Y. Mass
Affiliations:
IBM Research Division, Haifa Research Laboratory, Haifa, Israel (all authors)
Venue:
IBM Journal of Research and Development
Year:
2013

Abstract

In this paper, we present one possible way of analyzing social media conversational data in order to better understand customers. Ultimately, our goal is to analyze customer behavior as it is expressed in free-form conversations and to extract from it commercially valuable information about the customer. In this study, we concentrate on using statistical techniques to analyze this unstructured data at two levels: 1) at the level of the words used in the conversation and 2) at the level of the abstract concepts to which those words are mapped. The goal of such a statistical analysis is twofold. First, the statistically significant terms used by a user, and the concepts associated with them, provide insight into that user's interests, which commercial services can use, for example, to target advertisements. Second, knowing how a customer's interests and hobbies evolve can be exploited commercially by retailers, media and entertainment companies, telecommunications companies, and others. We describe a general framework for the analysis of social media data and then apply the framework to the statistical analysis of the language of tweets.
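
The two levels of analysis named in the abstract can be illustrated with a minimal sketch. The abstract does not specify the statistics used, so everything below is an assumption made for illustration: the scoring function (each term's contribution to the Kullback-Leibler divergence between a user's smoothed term distribution and a background distribution), the crude tokenizer, the sample tweets, and the hypothetical `CONCEPTS` dictionary standing in for a real word-to-concept mapping.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def significant_terms(user_tweets, background_tweets, top_k=2):
    """Rank a user's terms by how much more often the user says them than the background.

    Assumed scoring, not taken from the paper: p_user * log(p_user / p_bg),
    with add-one smoothing over the joint vocabulary.
    """
    user = Counter(t for tweet in user_tweets for t in tokenize(tweet))
    background = Counter(t for tweet in background_tweets for t in tokenize(tweet))
    vocab_size = len(set(user) | set(background))
    n_user = sum(user.values())
    n_background = sum(background.values())

    scores = {}
    for term, count in user.items():
        p_user = (count + 1) / (n_user + vocab_size)
        p_bg = (background[term] + 1) / (n_background + vocab_size)
        scores[term] = p_user * math.log(p_user / p_bg)

    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_k]


def concept_profile(terms, concept_map):
    """Level 2: count how many significant terms fall under each abstract concept."""
    return Counter(concept_map[t] for t in terms if t in concept_map)


# Hypothetical word-to-concept mapping, for illustration only.
CONCEPTS = {"guitar": "music", "concert": "music", "marathon": "sports"}

user_tweets = [
    "new guitar strings today",
    "guitar practice before the concert",
    "the concert was great",
]
background_tweets = [
    "the weather is great today",
    "traffic was bad today",
    "the game was great",
]

top = significant_terms(user_tweets, background_tweets, top_k=2)
profile = concept_profile(top, CONCEPTS)
```

Words that are frequent for everyone (such as "the" or "today") score at or below zero under this measure, so only terms that distinguish the user from the background reach the concept-mapping step.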