Source code is a mixed software artifact that contains information for both the compiler and the developers. While the programming language grammar dictates how source code is written, developers have considerable freedom in writing identifiers and comments. These are intentional in nature and become a means of communication between developers.

The goal of this paper is to analyze how the source code vocabulary changes during evolution, through an exploratory study of two software systems. Specifically, we collected data to answer a set of questions about vocabulary evolution, such as: How does the size of the source code vocabulary evolve over time? What do the most frequent terms refer to? Do new identifiers introduce new terms? Are terms shared between different types of identifiers and comments? Are new and deleted terms in one type of identifier mirrored in other types of identifiers or in comments?
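To make the questions above concrete, the following is a minimal sketch of how terms might be extracted from identifiers and compared across two versions of a system. It is not the tooling used in the study: the function names (`split_identifier`, `vocabulary`, `vocabulary_delta`) and the splitting rules (underscores, digits, and camelCase boundaries) are assumptions chosen for illustration.

```python
import re
from collections import Counter


def split_identifier(identifier):
    """Split an identifier into lowercase terms.

    Assumed splitting rules (illustrative only): break on any
    non-letter character, then on camelCase boundaries, keeping
    uppercase runs such as "XML" together as one term.
    """
    terms = []
    for part in re.split(r"[^A-Za-z]+", identifier):
        # Either an uppercase run not followed by a lowercase letter
        # (an acronym), or an optionally capitalized lowercase word.
        terms.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+", part))
    return [term.lower() for term in terms]


def vocabulary(identifiers):
    """Count how often each term occurs across a collection of identifiers."""
    counts = Counter()
    for identifier in identifiers:
        counts.update(split_identifier(identifier))
    return counts


def vocabulary_delta(old_vocab, new_vocab):
    """Return (added, removed) term sets between two vocabulary snapshots."""
    old_terms, new_terms = set(old_vocab), set(new_vocab)
    return new_terms - old_terms, old_terms - new_terms


# Hypothetical identifiers from two versions of the same system.
version_1 = vocabulary(["parseXMLFile", "max_buffer_size"])
version_2 = vocabulary(["parseJsonFile", "maxBufferSize", "file_reader"])
added, removed = vocabulary_delta(version_1, version_2)
```

In this sketch, vocabulary size over time corresponds to `len(vocabulary(...))` per version, and the most frequent terms can be read from `Counter.most_common`. Applying the same functions to words taken from comments would allow identifier and comment vocabularies to be compared with ordinary set intersection.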