In previous work, we proposed a set of static attributes that characterize input validation and input sanitization code patterns, and we showed that some of these attributes are significant predictors of SQL injection and cross-site scripting vulnerabilities. Static attributes have the advantage of reflecting general properties of a program. Dynamic attributes collected from execution traces, however, may reflect more specific code characteristics that are complementary to static attributes. To improve on our initial work, in this paper we propose the use of dynamic attributes to complement static attributes in vulnerability prediction. Furthermore, because existing work relies on supervised learning, it depends on the availability of training data labeled with known vulnerabilities. This paper presents prediction models based on both classification and clustering, which predict vulnerabilities in the presence and in the absence of labeled training data, respectively. In our experiments across six applications, our new supervised vulnerability predictors based on hybrid (static and dynamic) attributes achieved, on average, 90% recall and 85% precision, which is a sharp increase in recall compared to static analysis-based predictions. Though not nearly as accurate, our unsupervised predictors based on clustering achieved, on average, 76% recall and 39% precision, suggesting that they can be useful in the absence of labeled training data.
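To make the reported recall and precision figures concrete, the following minimal sketch computes both measures from confusion counts using their standard definitions. The function name and the counts are hypothetical: the counts are chosen only so that the result matches the averages reported for the clustering-based predictors, and they are not data from the experiments.

```python
from typing import Tuple


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Return (precision, recall) from confusion counts.

    tp: vulnerable items correctly flagged
    fp: non-vulnerable items wrongly flagged
    fn: vulnerable items that were missed
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical counts (an assumption, not experimental data):
# 100 truly vulnerable items, 76 of them flagged, plus 119 false alarms.
precision, recall = precision_recall(tp=76, fp=119, fn=24)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Under these assumed counts, a predictor with 76% recall and 39% precision finds roughly three of every four vulnerable items while raising about three false alarms for every two true findings.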