Is data clustering in adversarial settings secure?

Authors:
Battista Biggio;Ignazio Pillai;Samuel Rota Bulò;Davide Ariu;Marcello Pelillo;Fabio Roli
Affiliations:
Università di Cagliari, Cagliari, Italy;Università di Cagliari, Cagliari, Italy;FBK-irst, Trento, Italy;Università di Cagliari, Cagliari, Italy;Università Ca' Foscari di Venezia, Venice, Italy;Università di Cagliari, Cagliari, Italy
Venue:
Proceedings of the 2013 ACM workshop on Artificial intelligence and security
Year:
2013

Abstract

Clustering algorithms are increasingly adopted in security applications to spot dangerous or illicit activities. However, they were not originally devised to withstand deliberate attacks that aim to subvert the clustering process itself. Whether clustering can be safely adopted in such settings thus remains questionable. In this work we propose a general framework for identifying potential attacks against clustering algorithms and evaluating their impact, based on specific assumptions about the adversary's goal, knowledge of the attacked system, and capability to manipulate the input data. We show that an attacker may significantly poison the whole clustering process by adding a relatively small percentage of attack samples to the input data, and that some attack samples may be obfuscated so that they are hidden within existing clusters. We present a case study on single-linkage hierarchical clustering, and report experiments on clustering of malware samples and handwritten digits.
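To make the poisoning idea concrete, the following is a minimal sketch and not the authors' attack or code. It assumes a toy 2-D dataset, a single-linkage dendrogram cut at a fixed distance threshold, and a hypothetical helper `bridge_points` that places evenly spaced attack samples between two clusters. All names, coordinates, and the threshold are illustrative assumptions. The toy also uses far more attack samples relative to the data than the "relatively small percentage" the abstract reports.

```python
import math
from itertools import combinations


def single_linkage(points, threshold):
    """Cut a single-linkage dendrogram at `threshold`.

    This is equivalent to taking connected components of the graph that
    links every pair of points at distance <= threshold.
    Returns one integer cluster label per point.
    """
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        if math.dist(points[i], points[j]) <= threshold:
            parent[find(i)] = find(j)

    roots = [find(i) for i in range(len(points))]
    relabel = {r: k for k, r in enumerate(dict.fromkeys(roots))}
    return [relabel[r] for r in roots]


def bridge_points(a, b, n):
    """Hypothetical attack helper: n evenly spaced points strictly between a and b."""
    return [
        tuple(x + (y - x) * t / (n + 1) for x, y in zip(a, b))
        for t in range(1, n + 1)
    ]


# Toy data (assumed): two compact, well-separated clusters.
cluster_a = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (0.5, 0.5)]
cluster_b = [(5.0, 0.0), (5.5, 0.0), (5.0, 0.5), (5.5, 0.5)]
clean = cluster_a + cluster_b
threshold = 1.0

clean_labels = single_linkage(clean, threshold)

# The attacker adds a chain of samples spaced 0.75 apart, below the threshold,
# so single linkage merges the two clusters through the chain.
attack = bridge_points((0.5, 0.0), (5.0, 0.0), 5)
poisoned_labels = single_linkage(clean + attack, threshold)

n_clean = len(set(clean_labels))
n_poisoned = len(set(poisoned_labels))
print(n_clean, n_poisoned)
```

Single linkage merges clusters through their closest pair of points, so a thin chain of injected samples is enough to join groups that are otherwise far apart. This is one intuition for why the algorithm is a natural case study for poisoning.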