Quality of manual data collection in Java software: an empirical investigation

Authors:
Steve Counsell;George Loizou;Rajaa Najjar
Affiliations:
School of Computing, Information Systems and Mathematics, Brunel University, Uxbridge, Middlesex, UK UB8 1PH;School of Computer Science and Information Systems, Birkbeck, University of London, London, UK WC1E 7HX;School of Computer Science and Information Systems, Birkbeck, University of London, London, UK WC1E 7HX
Venue:
Empirical Software Engineering
Year:
2007



Abstract

Data collection, both automatic and manual, lies at the heart of all empirical studies. The quality of data collected from software informs decisions on maintenance, testing and wider issues such as the need for system re-engineering. While automatic data collection is the preferable of the two, there are numerous occasions when manual data collection is unavoidable. Yet very little evidence exists on the error-proneness of manual collection. Herein, we investigate how manual data collection for Java software compares with its automatic counterpart for the same data. We investigate three hypotheses relating to the difference between automated and manual data collection. Five Java systems were used to support our investigation. Results showed that, as expected, manual data collection was error-prone, but nowhere near to the extent we had initially envisaged. Key indicators of mistakes in manual data collection were found to be poor developer coding style, poor adherence to sound OO coding principles, and the existence of relatively large classes in some systems. Some interesting results were found relating to the collection of public class features and the types of error made during manual data collection. The study thus offers an insight into some of the typical problems associated with collecting data manually; more significantly, it highlights the adverse effect that poorly written systems have on the quality of visually extracted data.