Information awareness: a prospective technical assessment

Authors:
David Jensen; Matthew Rattigan; Hannah Blau
Affiliations:
University of Massachusetts, Amherst, MA; University of Massachusetts, Amherst, MA; University of Massachusetts, Amherst, MA
Venue:
Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
Year:
2003

Abstract

Recent proposals to apply data mining systems to problems in law enforcement, national security, and fraud detection have attracted both media attention and technical critiques of their expected accuracy and impact on privacy. Unfortunately, the majority of technical critiques have been based on simplistic assumptions about data, classifiers, inference procedures, and the overall architecture of such systems. We consider these critiques in detail, and we construct a simulation model that more closely matches realistic systems. We show how both the accuracy and privacy impact of a hypothetical system could be substantially improved, and we discuss the necessary and sufficient conditions for this improvement to be achieved. This analysis is neither a defense nor a critique of any particular system concept. Rather, our model suggests alternative technical designs that could mitigate some concerns, but also raises more specific conditions that must be met for such systems to be both accurate and socially desirable.