Mining SQL injection and cross site scripting vulnerabilities using hybrid program analysis

Authors: Lwin Khin Shar; Hee Beng Kuan Tan; Lionel C. Briand
Affiliations: Nanyang Technological University, Singapore; Nanyang Technological University, Singapore; University of Luxembourg, Luxembourg
Venue: Proceedings of the 2013 International Conference on Software Engineering
Year: 2013

Abstract

In previous work, we proposed a set of static attributes that characterize input validation and input sanitization code patterns, and we showed that some of these attributes are significant predictors of SQL injection and cross site scripting vulnerabilities. Static attributes have the advantage of reflecting general properties of a program. Dynamic attributes collected from execution traces, however, may reflect more specific code characteristics that complement static attributes. To improve on our initial work, in this paper we propose using dynamic attributes to complement static attributes in vulnerability prediction. Furthermore, because existing work relies on supervised learning, it depends on the availability of training data labeled with known vulnerabilities. This paper presents prediction models based on both classification and clustering, which predict vulnerabilities in the presence or absence of labeled training data, respectively. In our experiments across six applications, our new supervised vulnerability predictors based on hybrid (static and dynamic) attributes achieved, on average, 90% recall and 85% precision, a sharp increase in recall compared with static analysis-based predictions. Though not nearly as accurate, our unsupervised predictors based on clustering achieved, on average, 76% recall and 39% precision, suggesting that they can be useful when labeled training data is unavailable.
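The recall and precision figures quoted in the abstract are the usual measures for the "vulnerable" class. As a minimal sketch of how such figures are computed, assuming a simple boolean label per code unit, the following may help. The function name `precision_recall` and the toy labels are hypothetical illustrations, not the authors' tooling or data.

```python
from typing import Sequence, Tuple


def precision_recall(actual: Sequence[bool],
                     predicted: Sequence[bool]) -> Tuple[float, float]:
    """Return (precision, recall) for the 'vulnerable' (True) class.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")

    pairs = list(zip(actual, predicted))
    # True positives: vulnerable units that were predicted vulnerable.
    tp = sum(1 for a, p in pairs if a and p)
    # False positives: safe units that were predicted vulnerable.
    fp = sum(1 for a, p in pairs if not a and p)
    # False negatives: vulnerable units that the predictor missed.
    fn = sum(1 for a, p in pairs if a and not p)

    # Guard against division by zero when a class is absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Made-up toy labels (assumption): True means "vulnerable".
actual = [True, True, True, True, False, False, False, False, False, False]
predicted = [True, True, True, False, True, False, False, False, False, False]

precision, recall = precision_recall(actual, predicted)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Read this way, high recall means few vulnerable units are missed, while low precision (as reported for the clustering-based predictors) means many of the flagged units turn out to be safe.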