Identifying a critical threat to privacy through automatic image classification

  • Authors:
  • David Lorenzi; Jaideep Vaidya

  • Affiliations:
  • Rutgers University, Newark, NJ, USA; Rutgers University, Newark, NJ, USA

  • Venue:
  • Proceedings of the First ACM Conference on Data and Application Security and Privacy
  • Year:
  • 2011

Abstract

Image classification is generally considered a hard problem, even though it is necessary for many useful applications such as automatic target recognition; indeed, no general methods exist that work across varying scenarios while still achieving good performance across the board. In this paper, we identify a very interesting problem where image classification is dangerously easy. We look at image classification in the specific context of accurately classifying images containing highly sensitive data such as driver's licenses, credit cards, and passports. Our key contribution is a Hierarchical Temporal Memory (HTM) network that classifies many sensitive images with over 90% accuracy, which we use to build a system that automatically derives and transcribes sensitive information from image data. The system classifies images into two groups, sensitive and non-sensitive, and the group of sensitive images can then be further analyzed. This is a real-world security issue that could easily lead to privacy problems such as identity theft, since scans of passports and driver's licenses are routinely emailed or kept in digital form, and many local documents are left unencrypted. Essentially, an attacker can use data mining and machine learning techniques very effectively to breach individual privacy. Our main contribution is thus to demonstrate the efficacy of image classification for deriving sensitive information, which could also serve as a guide for other interesting applications such as document detection and analysis. It also serves as a warning against leaving data unencrypted and shows once again that security through obscurity is simply not enough.
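To make the two-stage pipeline described above concrete, the sketch below shows the general shape of such a system: train a binary classifier to separate sensitive from non-sensitive images, then forward the sensitive ones to a transcription step. This is a minimal illustration, not the authors' implementation; the paper uses an HTM network, whose configuration is not given in the abstract, so a linear SVM stands in as a generic classifier. The directory layout, image size, and file names are assumptions made purely for illustration.

```python
# Minimal sketch of a sensitive/non-sensitive image classification pipeline.
# NOTE: this is an illustrative stand-in, not the paper's HTM-based system.
# Assumed (hypothetical) layout: images/sensitive/*.png, images/non_sensitive/*.png

from pathlib import Path

import numpy as np
from PIL import Image
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC


def load_dataset(root: Path, size=(32, 32)):
    """Load images as flattened grayscale vectors with binary labels."""
    features, labels = [], []
    for label, folder in enumerate(["non_sensitive", "sensitive"]):
        for img_path in (root / folder).glob("*.png"):
            img = Image.open(img_path).convert("L").resize(size)
            features.append(np.asarray(img, dtype=np.float32).ravel() / 255.0)
            labels.append(label)
    return np.array(features), np.array(labels)


def main():
    X, y = load_dataset(Path("images"))  # "images/" is a placeholder directory
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=0, stratify=y
    )

    clf = LinearSVC()  # generic stand-in for the paper's HTM network
    clf.fit(X_train, y_train)
    print("held-out accuracy:", accuracy_score(y_test, clf.predict(X_test)))

    # Stage 2 (not shown): images predicted as sensitive would be handed to an
    # OCR/transcription step to derive fields such as names or document numbers.


if __name__ == "__main__":
    main()
```

The point of the sketch is only to show how little machinery an attacker needs once unencrypted scans are available: any reasonably accurate binary classifier, followed by off-the-shelf text extraction, reproduces the threat the paper describes.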