Detecting insider threats in a real corporate database of computer usage activity

Authors:
Ted E. Senator;Henry G. Goldberg;Alex Memory;William T. Young;Brad Rees;Robert Pierce;Daniel Huang;Matthew Reardon;David A. Bader;Edmond Chow;Irfan Essa;Joshua Jones;Vinay Bettadapura;Duen Horng Chau;Oded Green;Oguz Kaya;Anita Zakrzewska;Erica Briscoe;Rudolph L. Mappus IV;Robert McColl;Lora Weiss;Thomas G. Dietterich;Alan Fern;Weng-Keen Wong;Shubhomoy Das;Andrew Emmott;Jed Irvine;Jay-Yoon Lee;Danai Koutra;Christos Faloutsos;Daniel Corkill;Lisa Friedland;Amanda Gentzel;David Jensen
Affiliations:
SAIC, Arlington, VA, USA;SAIC, Arlington, VA, USA;SAIC, Arlington, VA, USA;SAIC, Arlington, VA, USA;SAIC, Arlington, VA, USA;SAIC, Arlington, VA, USA;SAIC, Arlington, VA, USA;SAIC, Arlington, VA, USA;Georgia Institute of Technology, Atlanta, GA, USA;Georgia Institute of Technology, Atlanta, GA, USA;Georgia Institute of Technology, Atlanta, GA, USA;Georgia Institute of Technology, Atlanta, GA, USA;Georgia Institute of Technology, Atlanta, GA, USA;Georgia Institute of Technology, Atlanta, GA, USA;Georgia Institute of Technology, Atlanta, GA, USA;Georgia Institute of Technology, Atlanta, GA, USA;Georgia Institute of Technology, Atlanta, GA, USA;Georgia Institute of Technology, Atlanta, GA, USA;Georgia Institute of Technology, Atlanta, GA, USA;Georgia Institute of Technology, Atlanta, GA, USA;Georgia Institute of Technology, Atlanta, GA, USA;Oregon State University, Corvallis, OR, USA;Oregon State University, Corvallis, OR, USA;Oregon State University, Corvallis, OR, USA;Oregon State University, Corvallis, OR, USA;Oregon State University, Corvallis, OR, USA;Oregon State University, Corvallis, OR, USA;Carnegie Mellon University, Pittsburgh, PA, USA;Carnegie Mellon University, Pittsburgh, PA, USA;Carnegie Mellon University, Pittsburgh, PA, USA;University of Massachusetts, Amherst, MA, USA;University of Massachusetts, Amherst, MA, USA;University of Massachusetts, Amherst, MA, USA;University of Massachusetts, Amherst, MA, USA
Venue:
Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Year:
2013



Abstract

This paper reports on the methods and results of an applied research project, conducted by a team from SAIC and four universities, to develop, integrate, and evaluate new approaches for detecting the weak signals characteristic of insider threats to organizations' information systems. Our system combines structural and semantic information from a real corporate database of monitored activity on users' computers to detect independently developed red-team inserts of malicious insider activity. We have developed and applied multiple anomaly detection algorithms based on suspected scenarios of malicious insider behavior, indicators of unusual activities, high-dimensional statistical patterns, temporal sequences, and normal graph evolution. Algorithms and representations for dynamic graph processing allow the system to scale as needed for enterprise-level deployments on real-time data streams. We have also developed a visual language for specifying combinations of features, baselines, peer groups, time periods, and algorithms to detect anomalies suggestive of insider threat behavior. We defined over 100 data features in seven categories, based on approximately 5.5 million actions per day from approximately 5,500 users. We achieved area under the ROC curve (AUC) values of up to 0.979 and lift values of 65 on the top 50 user-days identified in two months of real data.
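The two evaluation figures quoted in the abstract, area under the ROC curve and lift on the top-ranked user-days, can be illustrated with a minimal sketch. The abstract does not state how lift was computed, so the definition used here (precision among the k highest-scored items divided by the overall base rate of positives) is an assumption, and the function names `roc_auc` and `lift_at_k` and the toy scores are hypothetical, not taken from the paper.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive item scores
    higher than a randomly chosen negative item, counting ties as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def lift_at_k(scores: Sequence[float], labels: Sequence[int], k: int) -> float:
    """Assumed definition: precision in the top k divided by the base rate."""
    if not 0 < k <= len(scores):
        raise ValueError("k must be between 1 and the number of items")
    base_rate = sum(labels) / len(labels)
    if base_rate == 0:
        raise ValueError("lift is undefined when there are no positives")
    # Rank items from most to least anomalous and count positives in the top k.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = sum(y for _, y in ranked[:k])
    return (hits / k) / base_rate


# Toy data: ten user-days with anomaly scores; label 1 marks an inserted scenario.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_labels = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]

toy_auc = roc_auc(toy_scores, toy_labels)
toy_lift = lift_at_k(toy_scores, toy_labels, k=2)
print(f"AUC = {toy_auc:.4f}, lift@2 = {toy_lift:.2f}")
```

Under this definition, a lift of 65 on the top 50 user-days would mean that positives are 65 times more concentrated among those 50 user-days than in the data as a whole. The pairwise AUC loop is quadratic in the number of items and is meant only for illustration on small inputs.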