Data Mining of Software Development Databases

Authors:
Taghi M. Khoshgoftaar;Edward B. Allen;Wendell D. Jones;John P. Hudepohl
Affiliations:
Florida Atlantic University, Boca Raton, Florida, USA taghi@cse.fau.edu;Mississippi State University, Mississippi, USA edward.allen@computer.org;IBM, P.O. Box 12195, 600 Park Office Drive, Research Triangle Park, North Carolina, USA wendellj@us.ibm.com;Nortel Networks, Research Triangle Park, North Carolina, USA hudepohl@nortelnetworks.com
Venue:
Software Quality Control
Year:
2001

Abstract

Software quality models can predict which modules will have high risk, enabling developers to target enhancement activities to the most problematic modules. However, many find collection of the underlying software product and process metrics a daunting task.

Many software development organizations routinely use very large databases for project management, configuration management, and problem reporting which record data on events during development. These large databases can be an unintrusive source of data for software quality modeling. However, multiplied by many releases of a legacy system or a broad product line, the amount of data can overwhelm manual analysis. The field of data mining is developing ways to find valuable bits of information in very large databases. This aptly describes our software quality modeling situation.

This paper presents a case study that applied data mining techniques to software quality modeling of a very large legacy telecommunications software system's configuration management and problem reporting databases. The case study illustrates how useful models can be built and applied without interfering with development.