Ecological inference in empirical software engineering

Authors:
Daryl Posnett;Vladimir Filkov;Premkumar Devanbu
Affiliations:
Department of Computer Science, University of California, Davis, USA;Department of Computer Science, University of California, Davis, USA;Department of Computer Science, University of California, Davis, USA
Venue:
ASE '11 Proceedings of the 2011 26th IEEE/ACM International Conference on Automated Software Engineering
Year:
2011

Abstract

Software systems are decomposed hierarchically, for example, into modules, packages and files. This hierarchical decomposition has a profound influence on evolvability, maintainability and work assignment. Hierarchical decomposition is thus of central concern for empirical software engineering researchers, but it also poses a quandary. At what level should we study phenomena such as quality, distribution, collaboration and productivity: at the level of files, packages, or modules? How does the level of study affect the truth, meaning, and relevance of the findings? In other fields it has been found that choosing the wrong level can lead to misleading or fallacious results. Choosing a proper level of study is thus vitally important for empirical software engineering research, yet this issue has not so far been explicitly investigated. We describe the related ideas of ecological inference and the ecological fallacy from sociology and epidemiology, and explore their relevance to empirical software engineering. We also present case studies, using defect and process data from 18 open source projects, to illustrate the risks of modeling at an aggregated level, both in defect prediction and in hypothesis testing.
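
The risk described above can be made concrete with a small sketch. The numbers below are invented for illustration and are not taken from the paper's data; the package names, the `pearson` helper, and the (size, defects) pairs are all assumptions. The sketch compares the correlation between file size and defect count at three levels: pooled over all files, aggregated to package means, and within each package.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical data: each package maps to a list of (size, defects) pairs,
# one pair per file. These values are synthetic, chosen only to make the
# effect visible; they do not come from any real project.
packages = {
    "pkg_a": [(1, 3), (2, 2), (3, 1)],
    "pkg_b": [(4, 6), (5, 5), (6, 4)],
    "pkg_c": [(7, 9), (8, 8), (9, 7)],
}

# File level: pool every file, ignoring package boundaries.
files = [f for fs in packages.values() for f in fs]
file_level_r = pearson([s for s, _ in files], [d for _, d in files])

# Package level: aggregate each package to its mean size and mean defects.
package_means = [
    (sum(s for s, _ in fs) / len(fs), sum(d for _, d in fs) / len(fs))
    for fs in packages.values()
]
package_level_r = pearson(
    [s for s, _ in package_means], [d for _, d in package_means]
)

# Within each package: the relationship among that package's own files.
within_r = {
    name: pearson([s for s, _ in fs], [d for _, d in fs])
    for name, fs in packages.items()
}

print("file level:   ", round(file_level_r, 3))
print("package level:", round(package_level_r, 3))
print("within:       ", {k: round(v, 3) for k, v in within_r.items()})
```

In this toy data the aggregated (package-level) correlation is stronger than the pooled file-level correlation, and the relationship inside every package runs in the opposite direction. Transferring the package-level finding to individual files would therefore be an instance of the ecological fallacy.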