Software repositories provide a deluge of software artifacts to analyze. Researchers have attempted to summarize, categorize, and relate these artifacts by using semi-unsupervised machine-learning algorithms, such as Latent Dirichlet Allocation (LDA). LDA is used for concept and topic analysis to suggest candidate word-lists or topics that describe and relate software artifacts. However, these word-lists and topics are difficult to interpret in the absence of meaningful summary labels. Current attempts to interpret topics assume manual labelling and do not use domain-specific knowledge to improve, contextualize, or describe results for developers. We propose a solution: automated labelled topic extraction. Topics are extracted using LDA from commit-log comments recovered from source control systems. These topics are then given labels from a generalizable cross-project taxonomy consisting of non-functional requirements. We evaluated our approach with experiments and case studies on three large-scale Relational Database Management System (RDBMS) projects: MySQL, PostgreSQL, and MaxDB. The case studies show that labelled topic extraction can produce appropriate, context-sensitive labels that are relevant to these projects, and provide fresh insight into their evolving software development activities.
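The labelling step described above can be illustrated with a minimal sketch. It assumes the LDA step has already produced a list of top words per topic, and it assigns a non-functional-requirement label whenever a topic shares a word with that label's word-list. The word-lists, the `label_topic` helper, and the sample topics are all hypothetical and chosen for illustration; they are not the taxonomy or the matching rule used in the study.

```python
# Hypothetical word-lists for three non-functional requirements.
# Illustrative only: not the taxonomy used in the study.
NFR_WORDS = {
    "portability": {"port", "platform", "windows", "linux", "compile"},
    "efficiency": {"performance", "speed", "optimize", "memory", "cache"},
    "reliability": {"crash", "failure", "error", "recover", "bug"},
}


def label_topic(topic_words, nfr_words=NFR_WORDS):
    """Return the sorted labels whose word-list shares a word with the topic."""
    words = {w.lower() for w in topic_words}
    return sorted(label for label, vocab in nfr_words.items() if words & vocab)


# Assumed input: top words per topic, as an LDA run over commit-log
# comments might produce them (made-up examples).
topics = [
    ["fix", "crash", "windows", "build"],
    ["optimize", "query", "cache"],
    ["merge", "branch", "tree"],
]

labels = [label_topic(t) for t in topics]
for topic, topic_labels in zip(topics, labels):
    print(topic, "->", topic_labels or ["(unlabelled)"])
```

A topic can receive several labels or none at all; a topic with no matching words is simply left unlabelled rather than forced into a category.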