OBJECTIVE - To assess the extent and types of techniques used to manage quality within software engineering data sets. We consider this a particularly interesting question in the context of initiatives to promote sharing and secondary analysis of data sets. METHOD - We perform a systematic review of available empirical software engineering studies. RESULTS - Only 23 of the many hundreds of studies assessed explicitly considered data quality. CONCLUSIONS - First, the community needs to consider the quality and appropriateness of the data set being utilised; not all data sets are equal. Second, we need more research into means of identifying, and ideally repairing, noisy cases. Third, it should become routine to use sensitivity analysis to assess conclusion stability with respect to the assumptions that must be made concerning noise levels.
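The third conclusion can be illustrated with a minimal sketch. It is not the authors' procedure. It assumes one simple noise model (each binary defect label is flipped independently with a fixed probability), a toy data set invented for illustration, and a hypothetical conclusion under test ("defective modules are larger on average"). All function names are made up for this example. The idea is to re-run the analysis under several assumed noise levels and report how often the conclusion survives.

```python
import random
from statistics import mean


def conclusion_holds(sizes, labels):
    """Hypothetical conclusion: defective modules (label 1) are larger on average."""
    defective = [s for s, y in zip(sizes, labels) if y == 1]
    clean = [s for s, y in zip(sizes, labels) if y == 0]
    if not defective or not clean:
        # With an empty group the comparison is undefined; count it as not holding.
        return False
    return mean(defective) > mean(clean)


def flip_labels(labels, noise_rate, rng):
    """Assumed noise model: flip each binary label independently with probability noise_rate."""
    return [1 - y if rng.random() < noise_rate else y for y in labels]


def stability(sizes, labels, noise_rate, trials=200, seed=0):
    """Fraction of noisy resamples in which the conclusion still holds."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    hits = sum(
        conclusion_holds(sizes, flip_labels(labels, noise_rate, rng))
        for _ in range(trials)
    )
    return hits / trials


# Toy data, invented for illustration only: module size and a defect label.
sizes = [120, 340, 90, 410, 60, 380, 150, 500, 75, 290]
labels = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]

if __name__ == "__main__":
    for rate in (0.0, 0.1, 0.2, 0.3, 0.4):
        print(f"assumed noise rate {rate:.1f}: conclusion holds in "
              f"{stability(sizes, labels, rate):.0%} of trials")
```

A conclusion that holds in nearly all trials across the plausible range of noise rates can be reported with more confidence than one that degrades quickly. Other noise models (for example, noise in the measured attribute rather than the label) would need a different perturbation step.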