The Mahler experience: using an intermediate language as the machine description
ASPLOS II Proceedings of the second international conference on Architectual support for programming languages and operating systems
Link-time optimization of address calculation on a 64-bit architecture
PLDI '94 Proceedings of the ACM SIGPLAN 1994 conference on Programming language design and implementation
EEL: machine-independent executable editing
PLDI '95 Proceedings of the ACM SIGPLAN 1995 conference on Programming language design and implementation
Continuous profiling: where have all the cycles gone?
ACM Transactions on Computer Systems (TOCS)
Residual test coverage monitoring
Proceedings of the 21st international conference on Software engineering
Multivariate visualization in observation-based testing
Proceedings of the 22nd international conference on Software engineering
Extracting usability information from user interface events
ACM Computing Surveys (CSUR)
Mining needle in a haystack: classifying rare classes via two-phase rule induction
SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
A framework for reducing the cost of instrumented code
Proceedings of the ACM SIGPLAN 2001 conference on Programming language design and implementation
Finding failures by cluster analysis of execution profiles
ICSE '01 Proceedings of the 23rd International Conference on Software Engineering
Pursuing failure: the distribution of program failures in a profile space
Proceedings of the 8th European software engineering conference held jointly with 9th ACM SIGSOFT international symposium on Foundations of software engineering
Machine Learning
Monitoring deployed software using software tomography
Proceedings of the 2002 ACM SIGPLAN-SIGSOFT workshop on Program analysis for software tools and engineering
AdaCost: Misclassification Cost-Sensitive Boosting
ICML '99 Proceedings of the Sixteenth International Conference on Machine Learning
Fast Algorithms for Mining Association Rules in Large Databases
VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
Visualization of program-execution data for deployed software
Proceedings of the 2003 ACM symposium on Software visualization
Automated support for classifying software failure reports
Proceedings of the 25th International Conference on Software Engineering
Bug isolation via remote program sampling
PLDI '03 Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation
Leveraging field data for impact analysis and regression testing
Proceedings of the 9th European software engineering conference held jointly with 11th ACM SIGSOFT international symposium on Foundations of software engineering
Pattern Classification (2nd Edition)
Pattern Classification (2nd Edition)
Kernel Methods for Pattern Analysis
Kernel Methods for Pattern Analysis
Covering arrays for efficient fault characterization in complex configuration spaces
ISSTA '04 Proceedings of the 2004 ACM SIGSOFT international symposium on Software testing and analysis
Active learning for automatic classification of software behavior
ISSTA '04 Proceedings of the 2004 ACM SIGSOFT international symposium on Software testing and analysis
Tree-Based Methods for Classifying Software Failures
ISSRE '04 Proceedings of the 15th International Symposium on Software Reliability Engineering
Scalable statistical bug isolation
Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
Profiling Deployed Software: Assessing Strategies and Testing Opportunities
IEEE Transactions on Software Engineering
Applying classification techniques to remotely-collected program execution data
Proceedings of the 10th European software engineering conference held jointly with 13th ACM SIGSOFT international symposium on Foundations of software engineering
Selective capture and replay of program executions
WODA '05 Proceedings of the third international workshop on Dynamic analysis
Instrumentation and optimization of Win32/intel executables using Etch
NT'97 Proceedings of the USENIX Windows NT Workshop on The USENIX Windows NT Workshop 1997
Statistical debugging using compound boolean predicates
Proceedings of the 2007 international symposium on Software testing and analysis
Analysis of a deployed software
Proceedings of the the 6th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering
Analysis of a deployed software
The 6th Joint Meeting on European software engineering conference and the ACM SIGSOFT symposium on the foundations of software engineering: companion papers
Mining Bug Classifier and Debug Strategy Association Rules for Web-Based Applications
ADMA '08 Proceedings of the 4th international conference on Advanced Data Mining and Applications
A Design Science Research Methodology for Information Systems Research
Journal of Management Information Systems
Using machine learning to refine Category-Partition test specifications and test suites
Information and Software Technology
Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering - Volume 1
Proceedings of the 2010 Conference of the Center for Advanced Studies on Collaborative Research
Diagnosing new faults using mutants and prior faults (NIER track)
Proceedings of the 33rd International Conference on Software Engineering
iTree: efficiently discovering high-coverage configurations using interaction trees
Proceedings of the 34th International Conference on Software Engineering
How much does unused code matter for maintenance?
Proceedings of the 34th International Conference on Software Engineering
Multi-label software behavior learning
Proceedings of the 34th International Conference on Software Engineering
An empirical study on the use of mutant traces for diagnosis of faults in deployed systems
Journal of Systems and Software
Hi-index | 0.01 |
There is an increasing interest in techniques that support analysis and measurement of fielded software systems. These techniques typically deploy numerous instrumented instances of a software system, collect execution data when the instances run in the field, and analyze the remotely collected data to better understand the system's in-the-field behavior. One common need for these techniques is the ability to distinguish execution outcomes (e.g., to collect only data corresponding to some behavior or to determine how often and under which condition a specific behavior occurs). Most current approaches, however, do not perform any kind of classification of remote executions and either focus on easily observable behaviors (e.g., crashes) or assume that outcomes' classifications are externally provided (e.g., by the users). To address the limitations of existing approaches, we have developed three techniques for automatically classifying execution data as belonging to one of several classes. In this paper, we introduce our techniques and apply them to the binary classification of passing and failing behaviors. Our three techniques impose different overheads on program instances and, thus, each is appropriate for different application scenarios. We performed several empirical studies to evaluate and refine our techniques and to investigate the trade-offs among them. Our results show that 1) the first technique can build very accurate models, but requires a complete set of execution data; 2) the second technique produces slightly less accurate models, but needs only a small fraction of the total execution data; and 3) the third technique allows for even further cost reductions by building the models incrementally, but requires some sequential ordering of the software instances' instrumentation.