Active learning for automatic classification of software behavior

Authors:
James F. Bowring;James M. Rehg;Mary Jean Harrold
Affiliations:
Georgia Institute of Technology, Atlanta, Georgia;Georgia Institute of Technology, Atlanta, Georgia;Georgia Institute of Technology, Atlanta, Georgia
Venue:
ISSTA '04 Proceedings of the 2004 ACM SIGSOFT international symposium on Software testing and analysis
Year:
2004



Abstract

A program's behavior is ultimately the collection of all its executions. This collection is diverse, unpredictable, and generally unbounded, which makes it especially well suited to statistical analysis and machine-learning techniques. The primary focus of this paper is the automatic classification of program behavior using execution data. Prior work on classifiers for software engineering adopts a classical batch-learning approach. In contrast, we explore an active-learning paradigm for behavior classification, in which the classifier is trained incrementally on a series of labeled data elements. We also explore the thesis that certain features of program behavior are stochastic processes that exhibit the Markov property, and that the resulting Markov models of individual program executions can be automatically clustered into effective predictors of program behavior. We present a technique that models program executions as Markov models, and a clustering method for Markov models that aggregates multiple program executions into effective behavior classifiers. We evaluate an application of active learning to the efficient refinement of our classifiers by conducting three empirical studies that explore a scenario illustrating automated test plan augmentation.
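
The ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not state: an execution is taken to be a sequence of discrete event names, each execution is summarized by first-order transition counts, executions with the same label are simply merged (a stand-in for the paper's clustering method, which is not described here), and the active learner queries the execution whose two best class scores are closest. All function names, the `oracle` callback, and the smoothing constant `alpha` are hypothetical.

```python
from collections import defaultdict
import math


def transition_counts(trace):
    """Count first-order transitions between consecutive events in one trace."""
    counts = defaultdict(lambda: defaultdict(int))
    for src, dst in zip(trace, trace[1:]):
        counts[src][dst] += 1
    return counts


def merge_counts(models):
    """Aggregate several per-execution count tables into one model.
    Assumption: plain summation stands in for the paper's clustering step."""
    merged = defaultdict(lambda: defaultdict(int))
    for counts in models:
        for src, row in counts.items():
            for dst, n in row.items():
                merged[src][dst] += n
    return merged


def log_likelihood(trace, counts, alphabet_size, alpha=1.0):
    """Log-probability of a trace's transitions under a model,
    using add-alpha smoothing so unseen transitions keep nonzero mass."""
    total = 0.0
    for src, dst in zip(trace, trace[1:]):
        row = counts.get(src, {})
        row_sum = sum(row.values())
        p = (row.get(dst, 0) + alpha) / (row_sum + alpha * alphabet_size)
        total += math.log(p)
    return total


def train(labeled):
    """Build one aggregated Markov model per behavior label."""
    by_label = defaultdict(list)
    for trace, label in labeled:
        by_label[label].append(transition_counts(trace))
    return {label: merge_counts(ms) for label, ms in by_label.items()}


def classify(trace, models, alphabet_size):
    """Return the best-scoring label and its margin over the runner-up."""
    scores = {lab: log_likelihood(trace, m, alphabet_size)
              for lab, m in models.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    margin = (scores[ranked[0]] - scores[ranked[1]]
              if len(ranked) > 1 else float("inf"))
    return ranked[0], margin


def active_learn(labeled, unlabeled, oracle, alphabet_size, budget):
    """Incrementally refine the models: repeatedly ask the oracle to label
    the pool trace with the smallest margin, then retrain."""
    labeled, pool = list(labeled), list(unlabeled)
    models = train(labeled)
    for _ in range(min(budget, len(pool))):
        idx = min(range(len(pool)),
                  key=lambda i: classify(pool[i], models, alphabet_size)[1])
        trace = pool.pop(idx)
        labeled.append((trace, oracle(trace)))
        models = train(labeled)
    return models, labeled, pool
```

In this sketch, `oracle` plays the role of whatever supplies behavior labels (for example, a tester inspecting an execution), and `budget` caps how many labels are requested. Querying by smallest margin is one common uncertainty heuristic; the paper's own selection criterion may differ.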