Analyzing social networks in e-mail with rich syntactic features

  • Authors: Paul Thompson; Wei Zhang
  • Affiliations: Department of Computer Science, Dartmouth College, Hanover, NH (both authors)
  • Venue: ISI'06 Proceedings of the 4th IEEE International Conference on Intelligence and Security Informatics
  • Year: 2006

Abstract

Social network analysis has emerged as a key technique in countering crime and terrorism. The Enron e-mail dataset, originally made public and posted to the web by the Federal Energy Regulatory Commission during its investigation, consists of around half a million e-mails among several thousand individuals. Its value lies in being perhaps the only real e-mail dataset accessible to the research community. This paper presents preliminary results of an analysis of the Enron e-mail dataset based on a variation of the Author-Recipient-Topic (ART) model [1]. The GR-ART model described here uses grammatical relations, rather than bags of words, as features. We hypothesize that grammatical relations will provide a more useful model of authors, topics, and recipients than words alone. This research complements earlier work by one of the authors on applying information extraction techniques to cross-document named entity co-reference [2].
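
To make the contrast between bag-of-words and grammatical-relation features concrete, the following is a minimal sketch and not the authors' implementation. It assumes each message has already been parsed into (relation, head, dependent) triples by some dependency parser. The message, the relation labels (dobj, nn, det), and all function names are hypothetical and chosen only for illustration. Each feature is paired with the message's author and recipient, which is the kind of observation an ART-style model is built on.

```python
from collections import Counter

# Hypothetical, hand-written example message. The "relations" triples stand in
# for the output of a dependency parser; labels are illustrative only.
PARSED_MESSAGE = {
    "author": "alice",
    "recipients": ["bob"],
    "tokens": ["please", "review", "the", "contract", "draft"],
    "relations": [
        ("dobj", "review", "draft"),
        ("nn", "draft", "contract"),
        ("det", "draft", "the"),
    ],
}


def bag_of_words_features(message):
    """Count each word on its own, ignoring how words relate to each other."""
    return Counter(message["tokens"])


def grammatical_relation_features(message):
    """Count each (relation, head, dependent) triple as a single feature."""
    return Counter(
        f"{rel}({head},{dep})" for rel, head, dep in message["relations"]
    )


def author_recipient_observations(message, featurize):
    """Pair every feature with the author and each recipient of the message."""
    features = featurize(message)
    return [
        (message["author"], recipient, feature, count)
        for recipient in message["recipients"]
        for feature, count in sorted(features.items())
    ]


if __name__ == "__main__":
    for row in author_recipient_observations(PARSED_MESSAGE, bag_of_words_features):
        print("words:", row)
    for row in author_recipient_observations(
        PARSED_MESSAGE, grammatical_relation_features
    ):
        print("relations:", row)
```

In this sketch the word "draft" appears as one undifferentiated token in the bag-of-words view, whereas the relation view keeps it tied to "review" as its object and to "contract" as its modifier. Whether such features actually yield better author, recipient, and topic models is the hypothesis the paper sets out to examine.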