Investigating data quality problems in the PSP

  • Authors:
  • Anne M. Disney;Philip M. Johnson

  • Affiliations:
  • Dept. of Information and Computer Sciences, University of Hawaii, Honolulu, HI;Dept. of Information and Computer Sciences, University of Hawaii, Honolulu, HI

  • Venue:
  • SIGSOFT '98/FSE-6 Proceedings of the 6th ACM SIGSOFT international symposium on Foundations of software engineering
  • Year:
  • 1998

Quantified Score

Hi-index 0.00

Visualization

Abstract

The Personal Software Process (PSP) is used by software engineers to gather and analyze data about their work. Published studies typically use data collected using the PSP to draw quantitative conclusions about its impact upon programmer behavior and product quality. However, our experience using PSP in both industrial and academic settings revealed problems both in collection of data and its later analysis. We hypothesized that these two kinds of data quality problems could make a significant impact upon the value of PSP measures. To test this hypothesis, we built a tool to automate the PSP and then examined 89 projects completed by ten subjects using the PSP manually in an educational setting. We discovered 1539 primary errors and categorized them by type, subtype, severity, and age. To examine the collection problem we looked at the 90 errors that represented impossible combinations of data and at other less concrete anomalies in Time Recording Logs and Defect Recording Logs. To examine the analysis problem we developed a rule set, corrected the errors as far as possible, and compared the original and corrected data. This resulted in significant differences for measures such as yield and the cost-performance ratio, confirming our hypothesis. Our results raise questions about the accuracy of manually collected and analyzed PSP data, indicate that integrated tool support may be required for high quality PSP data analysis, and suggest that external measures should be used when attempting to evaluate the impact of the PSP upon programmer behavior and product quality.