Extracting Meaning from Abbreviated Identifiers

Authors:
Dawn Lawrie;Henry Feild;David Binkley
Affiliations:
Loyola College, USA;Loyola College, USA;Loyola College, USA
Venue:
SCAM '07 Proceedings of the Seventh IEEE International Working Conference on Source Code Analysis and Manipulation
Year:
2007

Citing 0
Cited 7

AMAP: automatically mining abbreviation expansions in programs to enhance software maintenance tools
Proceedings of the 2008 international working conference on Mining software repositories

Automatically capturing source code context of NL-queries for software maintenance and reuse
ICSE '09 Proceedings of the 31st International Conference on Software Engineering

Source code indexing for automated tracing
Proceedings of the 6th International Workshop on Traceability in Emerging Forms of Software Engineering

Semantic enrichment process: An approach to software component reuse in modernizing enterprise systems
Information Systems Frontiers

Source code identifier splitting using Yahoo image and web search engine
Proceedings of the First International Workshop on Software Mining

Risk chain prediction metrics for predicting fault proneness in object oriented systems
Proceedings of the Second International Conference on Computational Science, Engineering and Information Technology

Supporting concept location through identifier parsing and ontology extraction
Journal of Systems and Software

Abstract

Informative identifiers are made up of full (natural language) words and (meaningful) abbreviations. Readers of programs typically have little trouble understanding the purpose of identifiers composed of full words. In addition, those familiar with the code can (most often) determine the meaning of abbreviations used in identifiers. However, when faced with unfamiliar code, abbreviations often carry little useful information. Furthermore, tools that focus on the natural language used in the code have a hard time in the presence of abbreviations. One approach to providing meaning to programmers and tools is to translate (expand) abbreviations into full words. This paper presents a methodology for expanding identifiers and evaluates the process on a code base of just over 35 million lines of code. For example, using phrase extraction, fs exists is expanded to file status exists, illustrating how the expansion process can facilitate comprehension. On average, 16 percent of the identifiers in a program are expanded. Finally, as an example application, the approach is used to improve the syntactic identification of violations of Deißenböck and Pizka's rules for concise and consistent identifier construction.
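
The expansion example in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm, whose details the abstract does not give. It assumes that an identifier is split on underscores and camelCase boundaries, that a part not found in a word list is treated as an abbreviation, and that an abbreviation is expanded only when exactly one known phrase has matching initials. The function names, the word list, the phrase list, and the underscore-separated spelling fs_exists are all hypothetical.

```python
import re


def split_identifier(identifier):
    """Split on underscores and camelCase boundaries; lowercase each part."""
    parts = []
    for chunk in identifier.split("_"):
        parts.extend(re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+", chunk))
    return [p.lower() for p in parts]


def acronym_candidates(abbrev, phrases):
    """Return the phrases whose word initials spell the abbreviation."""
    return [p for p in phrases
            if "".join(w[0] for w in p.split()) == abbrev]


def expand_identifier(identifier, dictionary, phrases):
    """Expand abbreviated parts of an identifier (illustrative only).

    dictionary: set of known full words (assumed input).
    phrases: candidate phrases, e.g. taken from comments (assumed input).
    """
    expanded = []
    for part in split_identifier(identifier):
        if part in dictionary:
            expanded.append(part)  # already a full word
            continue
        candidates = acronym_candidates(part, phrases)
        if len(candidates) == 1:
            expanded.extend(candidates[0].split())  # unambiguous expansion
        else:
            expanded.append(part)  # no match or ambiguous: leave as-is
    return "_".join(expanded)


if __name__ == "__main__":
    words = {"exists"}
    phrases = ["file status", "read buffer"]
    print(expand_identifier("fs_exists", words, phrases))
```

Leaving ambiguous abbreviations untouched is a deliberate choice in this sketch: a wrong expansion is likely to mislead a reader more than an unexpanded abbreviation.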