Towards mining informal online data to guide component-reuse decisions

Authors:
Sanchit Karve; Christopher Scaffidi
Affiliations:
McAfee Software, Beaverton, OR, USA; Oregon State University, Corvallis, OR, USA
Venue:
Proceedings of the 16th International ACM SIGSOFT Symposium on Component-Based Software Engineering
Year:
2013

Abstract

Online repositories offer components for reuse, but not all such components are equally reusable. Components might be unreliable, overly specialized, or otherwise inappropriate for reuse. Repositories collect reviews, ratings, and other data intended to help software engineers choose components. But do these data actually provide any information related to reusability? If so, how can such information be extracted from the data? To address these questions, we analyzed online ratings, reviews, and other data for nearly 1200 online components, computed statistics for each component based on these data, and used factor analysis to identify three groups of statistics (factors) that were each internally correlated. We then interviewed software engineers about the reusability of 36 other components and used linear regression to test how well the three factors actually corresponded to component reusability. We found that two of the three factors were indeed related to reusability. Specifically, the reusability of components could be predicted on the basis of component authors' prior work and the documentation provided about components. This result could be used in future work to develop enhanced search engines that highlight components that are potentially reusable and perhaps worthy of more time-consuming evaluation, such as by applying formal methods. Additionally, our results reveal opportunities to improve online repositories through specific simplifications as well as enhancements.
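The regression step described in the abstract can be illustrated with a minimal sketch. It is not the authors' code or data: the factor scores and reusability values below are synthetic, the coefficients used to generate them are arbitrary assumptions, and the helper names (`solve_linear_system`, `fit_ols`, `r_squared`) are hypothetical. The sketch only shows how a reusability score might be regressed on three per-component factor scores using ordinary least squares. It does not perform the factor analysis or any significance testing.

```python
# Illustrative sketch only: synthetic data and hypothetical helper names,
# not the data, factors, or code from the paper.
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so that row operations apply to both sides.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows, y):
    """Ordinary least squares with an intercept; returns [b0, b1, ..., bk]."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    # Normal equations: (X^T X) beta = X^T y.
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def r_squared(rows, y, coef):
    """Fraction of variance in y explained by the fitted model."""
    pred = [coef[0] + sum(c * v for c, v in zip(coef[1:], r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


rng = random.Random(0)
# 36 synthetic components, each with three made-up factor scores.
factor_scores = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(36)]
# Assumption for illustration: reusability depends on the first two
# factors only, plus noise. These weights are arbitrary.
reusability = [
    2.0 + 1.5 * f[0] + 0.8 * f[1] + rng.gauss(0, 0.3) for f in factor_scores
]

coef = fit_ols(factor_scores, reusability)
print("intercept and slopes:", [round(c, 2) for c in coef])
print("R^2:", round(r_squared(factor_scores, reusability, coef), 3))
```

In this synthetic setup, the slope for the third factor comes out close to zero because the data were generated that way. A real analysis would also report standard errors or p-values for each coefficient, which a statistics package would provide.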