Because early variable mnemonics were limited to as few as six to eight characters, many early programmers abbreviated concepts in their variable names. The past thirty years have seen a steady increase in permitted name length and, more slowly, an increase in actual identifier length. In theory, however, names can be too long for programmers to comprehend and manipulate effectively. Most obviously, in object-oriented programs, entity naming often involves chaining method calls and field selectors (e.g., class.firstAssignment().name.trim()). While longer names bring the potential for better comprehension through more embedded sub-words, there are practical limits to their length given limited human memory resources. The driving hypothesis behind the presented study is that names used in modern programs have reached this limit. Thus, a goal of the study is to better understand the balance between longer, more expressive names and limited programmer memory resources. Statistical models derived from an experiment involving 158 programmers of varying degrees of experience show that longer names extracted from production code take more time to process and reduce correctness in a simple recall activity. This has clear negative implications for any attempt to read, and hence comprehend or manipulate, the source code found in modern software. The experiment also evaluates the advantage of identifiers having probable ties to a programmer's persistent memory. Combined, these results reinforce past proposals advocating the use of limited, consistent, and regular vocabulary in identifier names. In particular, good naming limits individual name length and reduces the need for specialized vocabulary.