Automatic detection and correction of web application vulnerabilities using data mining to predict false positives

Authors:
Ibéria Medeiros;Nuno F. Neves;Miguel Correia
Affiliations:
University of Lisboa, Faculty of Sciences, Lisboa, Portugal;University of Lisboa, Faculty of Sciences, Lisboa, Portugal;University of Lisboa, Instituto Superior Técnico, Lisboa, Portugal
Venue:
Proceedings of the 23rd international conference on World wide web
Year:
2014

Abstract

Web application security is an important problem in today's internet. A major cause is that many programmers lack adequate knowledge of secure coding, so they leave applications with vulnerabilities. One approach to this problem is to use static analysis of source code to find these bugs, but such tools are known to report many false positives, which makes correcting the application harder. This paper explores a hybrid of methods to detect vulnerabilities with fewer false positives. After an initial step that uses taint analysis to flag candidate vulnerabilities, our approach uses data mining to predict which candidates are false positives. This approach reaches a trade-off between two apparently opposite approaches: humans coding the knowledge about vulnerabilities (for taint analysis) versus automatically obtaining that knowledge (with machine learning, for data mining). Given this more precise form of detection, we perform automatic code correction by inserting fixes into the source code. The approach was implemented in the WAP tool, and an experimental evaluation was performed with a large set of open source PHP applications.
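
The three stages named in the abstract (flag candidates with taint analysis, predict false positives, insert fixes) can be illustrated with a minimal Python sketch. It is not WAP's implementation: every name in it is hypothetical, the taint tracking is a line-by-line toy over PHP-like strings, the trained data-mining classifier is replaced by a hand-written rule that checks a single assumed attribute (whether a validation function appears on the data-flow path), and `sanitize_sql` is a placeholder rather than a real PHP function.

```python
import re
from dataclasses import dataclass

# All names below are hypothetical; this is not WAP's code or API.
ENTRY_POINTS = {"$_GET", "$_POST", "$_COOKIE"}               # assumed taint sources
SINK = "mysql_query"                                         # one assumed sensitive sink
VALIDATION_HINTS = ("is_numeric", "intval", "ctype_digit")   # assumed attributes

VAR = re.compile(r"\$\w+")
ASSIGN = re.compile(r"^(\$\w+)\s*=(?!=)\s*(.*)$")


@dataclass
class Candidate:
    sink_line: str   # line where tainted data reaches the sink
    path: list       # earlier lines the tainted data passed through


def flag_candidates(lines):
    """Step 1, toy taint analysis: propagate taint through assignments and
    flag every sink call that receives a tainted variable."""
    paths = {}       # tainted variable -> lines on its data-flow path
    found = []
    for raw in lines:
        line = raw.strip()
        m = ASSIGN.match(line)
        if m:
            target, expr = m.group(1), m.group(2)
            used = VAR.findall(expr)
            if any(v in ENTRY_POINTS or v in paths for v in used):
                inherited = [l for v in used for l in paths.get(v, [])]
                paths[target] = inherited + [line]
            else:
                paths.pop(target, None)   # overwritten with untainted data
            continue
        used = [v for v in VAR.findall(line) if v in paths]
        if not used:
            continue
        if SINK + "(" in line:
            found.append(Candidate(line, [l for v in used for l in paths[v]]))
        else:
            for v in used:                # e.g. a validation check on v
                paths[v].append(line)
    return found


def predicted_false_positive(candidate):
    """Step 2, stand-in for the trained classifier: a hand-written rule over
    one attribute (is a validation function present on the path?)."""
    return any(h in l for l in candidate.path for h in VALIDATION_HINTS)


def insert_fix(sink_line):
    """Step 3, toy correction: wrap the sink argument in a sanitizer.
    `sanitize_sql` is a placeholder name, not a real PHP function."""
    return re.sub(rf"{SINK}\((.*)\)", rf"{SINK}(sanitize_sql(\1))", sink_line)


def analyze(lines):
    report = []
    for c in flag_candidates(lines):
        if predicted_false_positive(c):
            report.append(("false positive", c.sink_line))
        else:
            report.append(("fixed", insert_fix(c.sink_line)))
    return report


unchecked = ["$id = $_GET['id'];",
             '$q = "SELECT * FROM t WHERE id = $id";',
             "mysql_query($q);"]
checked = [unchecked[0], "if (!is_numeric($id)) { exit; }"] + unchecked[1:]

print(analyze(unchecked))
print(analyze(checked))
```

Both snippets are flagged by the taint step because user input reaches the query. Only the first is rewritten; the second is dismissed because a numeric check sits on the path, which is the kind of case where plain taint analysis would report a false positive.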