Learning from Examples: Generation and Evaluation of Decision Trees for Software Resource Analysis

Authors:
R. W. Selby; A. A. Porter
Affiliations:
Univ. of California, Irvine; Univ. of California, Irvine
Venue:
IEEE Transactions on Software Engineering - Special Issue on Artificial Intelligence in Software Applications
Year:
1988

Abstract

A general solution method for the automatic generation of decision (or classification) trees is investigated. The approach is to provide insights through in-depth empirical characterization and evaluation of decision trees for one problem domain: software resource data analysis. The purpose of the decision trees is to identify classes of objects (software modules) that had high development effort, i.e., effort in the uppermost quartile relative to past data. Sixteen software systems ranging from 3,000 to 112,000 source lines were selected for analysis from a NASA production environment. The 74 attributes (or metrics) collected and analyzed for over 4,700 objects capture a multitude of information about the objects: development effort, faults, changes, design style, and implementation style. A total of 9,600 decision trees are automatically generated and evaluated. The analysis focuses on the characterization and evaluation of decision tree accuracy, complexity, and composition. On average across all 9,600 trees, the decision trees correctly identified 79.3% of the software modules that had high development effort or faults. The decision trees generated from the best parameter combinations correctly identified 88.4% of the modules on average. Visualization of the results is emphasized, and sample decision trees are included.
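
The classification task described in the abstract can be illustrated with a minimal sketch. It labels modules whose effort falls in the uppermost quartile as "high effort" and then builds a small decision tree over module metrics. Everything here is an assumption made for illustration: the abstract does not state the tree-generation algorithm or its parameters, so the sketch uses a generic entropy-based (information gain) split, and the attribute names (`source_lines`, `changes`), the effort values, and the `max_depth` limit are hypothetical rather than taken from the NASA data.

```python
import math
from statistics import quantiles

# Hypothetical module data (NOT from the paper): two metrics plus effort.
modules = [
    {"source_lines": 120, "changes": 2, "effort": 10},
    {"source_lines": 300, "changes": 5, "effort": 25},
    {"source_lines": 80, "changes": 1, "effort": 8},
    {"source_lines": 950, "changes": 14, "effort": 140},
    {"source_lines": 400, "changes": 3, "effort": 30},
    {"source_lines": 1100, "changes": 20, "effort": 180},
    {"source_lines": 250, "changes": 4, "effort": 18},
    {"source_lines": 60, "changes": 0, "effort": 5},
    {"source_lines": 700, "changes": 9, "effort": 60},
    {"source_lines": 1300, "changes": 11, "effort": 210},
    {"source_lines": 150, "changes": 6, "effort": 15},
    {"source_lines": 500, "changes": 7, "effort": 45},
]

# Target class: effort above the third quartile of the observed data.
cutoff = quantiles([m["effort"] for m in modules], n=4)[2]
labels = [m["effort"] > cutoff for m in modules]
rows = [{k: v for k, v in m.items() if k != "effort"} for m in modules]


def entropy(ys):
    """Shannon entropy in bits of a list of boolean labels."""
    if not ys:
        return 0.0
    p = sum(ys) / len(ys)
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def best_split(rows, ys):
    """Return (gain, attribute, threshold) of the best `value <= threshold` test."""
    base, best = entropy(ys), (0.0, None, None)
    for attr in rows[0]:
        # Skip the largest value so both branches are always non-empty.
        for t in sorted({r[attr] for r in rows})[:-1]:
            left = [y for r, y in zip(rows, ys) if r[attr] <= t]
            right = [y for r, y in zip(rows, ys) if r[attr] > t]
            rest = (len(left) * entropy(left) + len(right) * entropy(right)) / len(ys)
            if base - rest > best[0]:
                best = (base - rest, attr, t)
    return best


def build_tree(rows, ys, depth=0, max_depth=3):
    """Recursively partition the data; a leaf is the majority class (bool)."""
    _, attr, t = best_split(rows, ys)
    if attr is None or depth >= max_depth:
        return 2 * sum(ys) >= len(ys)
    lo = [(r, y) for r, y in zip(rows, ys) if r[attr] <= t]
    hi = [(r, y) for r, y in zip(rows, ys) if r[attr] > t]
    return (
        attr,
        t,
        build_tree([r for r, _ in lo], [y for _, y in lo], depth + 1, max_depth),
        build_tree([r for r, _ in hi], [y for _, y in hi], depth + 1, max_depth),
    )


def classify(tree, row):
    """Walk from the root to a leaf and return the predicted class."""
    while isinstance(tree, tuple):
        attr, t, lo, hi = tree
        tree = lo if row[attr] <= t else hi
    return tree


tree = build_tree(rows, labels)
predictions = [classify(tree, r) for r in rows]

# Fraction of truly high-effort modules that the tree flags as high effort.
high = [p for p, y in zip(predictions, labels) if y]
identified = sum(high) / len(high)
print(tree, identified)
```

Because the sketch scores the tree on the same toy data it was built from, the resulting fraction overstates what such a tree would achieve on unseen modules; it only shows how a "percentage of high-effort modules correctly identified" figure can be computed.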