An approach for temporal analysis of email data based on segmentation

  • Authors:
  • Parvathi Chundi;Mahadevan Subramaniam;Dileep K. Vasireddy

  • Affiliations:
  • Computer Science Department, Univ. of Nebraska at Omaha, Omaha, NE 68182, United States;Computer Science Department, Univ. of Nebraska at Omaha, Omaha, NE 68182, United States;Computer Science Department, Univ. of Nebraska at Omaha, Omaha, NE 68182, United States

  • Venue:
  • Data & Knowledge Engineering
  • Year:
  • 2009

Abstract

Many kinds of information are hidden in email data, such as the information being exchanged, the time of exchange, and the user IDs participating in the exchange. Analyzing email data can reveal valuable information about the social networks of a single user or multiple users, the topics being discussed, and so on. In this paper, we describe a novel approach for temporally analyzing the communication patterns embedded in email data based on time series segmentation. The approach computes egocentric communication patterns of a single user, as well as sociocentric communication patterns involving multiple users. Time series segmentation is used to uncover patterns that may span multiple time points and to study how these patterns change over time. To find egocentric patterns, the email communication of a user is represented as an item-set time series. An optimal segmentation of the item-set time series is constructed, from which patterns are extracted. To find sociocentric patterns, the email data is represented as an item-setgroup time series. Patterns involving multiple users are then extracted from an optimal segmentation of the item-setgroup time series. The proposed approach was applied to the Enron email data set and produced very promising results.
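
To make the egocentric case concrete, the following is a minimal Python sketch of an item-set time series and a dynamic-programming segmentation of it. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the record layout, the time granularity, or the segment cost, so the `(day, sender, recipients)` tuples, the per-day time points, the function names, and the cost used here (how many user IDs of a segment's union are absent at each time point) are all hypothetical choices.

```python
from collections import defaultdict
from datetime import date


def item_set_time_series(emails, user):
    """Return one set of correspondent IDs per time point (here: per day).

    `emails` is an iterable of (day, sender, recipients) tuples; this record
    layout is an assumption made for the sketch.
    """
    series = defaultdict(set)
    for day, sender, recipients in emails:
        if sender == user:
            series[day].update(r for r in recipients if r != user)
        elif user in recipients:
            series[day].add(sender)
    return [series[d] for d in sorted(series)]


def segment_cost(sets):
    """Assumed cost: items of the segment's union missing at each time point."""
    union = set().union(*sets)
    return sum(len(union - s) for s in sets)


def optimal_segmentation(series, k):
    """Split `series` into k contiguous segments of minimal total cost.

    Requires 1 <= k <= len(series). Returns (segments, cost), where each
    segment is a half-open index range (start, end).
    """
    n = len(series)
    inf = float("inf")
    # best[j][i]: minimal cost of covering series[:i] with j segments
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0
    for j in range(1, k + 1):
        for i in range(j, n + 1):
            for s in range(j - 1, i):
                cost = best[j - 1][s] + segment_cost(series[s:i])
                if cost < best[j][i]:
                    best[j][i] = cost
                    cut[j][i] = s
    # Walk the stored cut points backwards to recover the segments.
    segments, i = [], n
    for j in range(k, 0, -1):
        s = cut[j][i]
        segments.append((s, i))
        i = s
    return segments[::-1], best[k][n]


emails = [
    (date(2001, 1, 1), "alice", ["bob", "carol"]),
    (date(2001, 1, 2), "bob", ["alice"]),
    (date(2001, 1, 2), "dave", ["erin"]),  # does not involve alice, ignored
]
series = item_set_time_series(emails, "alice")
segments, cost = optimal_segmentation(series, 2)
```

Under this assumed cost, a segment is cheap when the user corresponds with the same set of people at every time point in it, so segment boundaries tend to fall where the set of correspondents changes. A sociocentric variant would need a representation covering multiple users per time point, which this sketch does not attempt.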