Detecting data misuse by applying context-based data linkage

Authors:
Ma'ayan Gafny;Asaf Shabtai;Lior Rokach;Yuval Elovici
Affiliations:
Ben Gurion University, Beer-Sheva, Israel;Ben Gurion University, Beer-Sheva, Israel;Ben Gurion University, Beer-Sheva, Israel;Ben Gurion University, Beer-Sheva, Israel
Venue:
Proceedings of the 2010 ACM workshop on Insider threats
Year:
2010



Abstract

Detecting data leakage and misuse poses a great challenge for organizations. Whether caused by malicious intent or an inadvertent mistake, data leakage or misuse can diminish a company's brand, reduce shareholder value, and damage the company's goodwill and reputation. The challenge intensifies when the leakage or misuse is performed by an insider with legitimate permissions to access the organization's systems and its critical data. In this paper we propose a new approach for identifying suspicious insiders who can access data stored in a database via an application. In the proposed method, suspicious access to sensitive data is detected by analyzing the result-sets sent to the user in response to the user's request. Result-sets are analyzed within the instantaneous context in which the request was submitted. From the analysis of the result-set and the context we derive a "level of anomality". If the derived level is above a predefined threshold, an alert can be sent to the security officer. The proposed method applies data-linkage techniques to link the contextual features and the result-sets. Machine learning algorithms are then employed to generate a behavioral model during a learning phase. The behavioral model encapsulates knowledge about the behavior of a user, i.e., the characteristics of the result-sets of legitimate or malicious requests. This behavioral model is used to identify malicious requests based on their abnormality. An evaluation with sanitized data shows the usefulness of the proposed method in detecting data misuse.
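The idea of scoring a result-set against the context of its request can be illustrated with a small Python sketch. This is not the authors' method: the abstract does not specify the data-linkage or learning algorithms, so the sketch substitutes a simple "values seen per context" lookup for the learned behavioral model. The names `ContextProfile`, `should_alert`, the context tuple, the record fields, and the 0.5 threshold are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


class ContextProfile:
    """Toy stand-in for a behavioral model (an assumption, not the paper's model).

    For each context it remembers which (attribute, value) pairs appeared
    in result-sets of requests treated as legitimate during learning.
    """

    def __init__(self):
        # context tuple -> set of (attribute, value) pairs seen while learning
        self.seen = defaultdict(set)

    def learn(self, context, result_set):
        """Record every (attribute, value) pair of a legitimate result-set."""
        for record in result_set:
            self.seen[context].update(record.items())

    def anomality(self, context, result_set):
        """Fraction of records holding at least one pair unseen in this context."""
        if not result_set:
            return 0.0
        known = self.seen.get(context, set())
        unseen = sum(
            1
            for record in result_set
            if any(item not in known for item in record.items())
        )
        return unseen / len(result_set)


def should_alert(profile, context, result_set, threshold=0.5):
    """Raise an alert when the anomality level exceeds the (hypothetical) threshold."""
    return profile.anomality(context, result_set) > threshold


# Learning phase: a clerk at a hypothetical branch during business hours.
profile = ContextProfile()
ctx = ("branch_A", "business_hours")
profile.learn(ctx, [
    {"city": "Haifa", "type": "private"},
    {"city": "Haifa", "type": "business"},
])

# Detection phase: one ordinary result-set and one with unfamiliar customers.
normal = [{"city": "Haifa", "type": "private"}]
odd = [
    {"city": "Eilat", "type": "private"},
    {"city": "Haifa", "type": "business"},
    {"city": "Eilat", "type": "business"},
]
print(should_alert(profile, ctx, normal))  # no unseen values in this context
print(should_alert(profile, ctx, odd))     # two of three records are unfamiliar
```

The sketch scores the returned records rather than the query text, mirroring the abstract's focus on result-sets. A realistic model would generalize beyond exact value matches, which is the role the abstract assigns to the learning phase.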