The literature reports that the source code lexicon plays a paramount role in program comprehension, especially when software documentation is scarce, outdated, or simply not available. In source code, a significant proportion of the vocabulary consists of acronyms and/or abbreviations, or of concatenations of terms that cannot be identified using consistent mechanisms such as naming conventions. It is therefore essential to disambiguate the concepts conveyed by identifiers, both to support program comprehension and to reap the full benefit of Information Retrieval-based techniques (e.g., feature location and traceability), which require the linguistic information (i.e., source code identifiers and comments) used across all software artifacts (e.g., requirements, design, change requests, tests, and source code) to be consistent. To this aim, we propose source code vocabulary normalization approaches that exploit contextual information to align the vocabulary found in the source code with that found in other software artifacts. Our choice of context levels was inspired by prior work and by our own findings. Normalization consists of two tasks: splitting and expansion of source code identifiers. We also investigate the effect of source code vocabulary normalization approaches on software maintenance tasks. Results of our evaluation show that our context-aware techniques are accurate and more efficient in terms of computation time than state-of-the-art alternatives. In addition, our findings reveal that feature location techniques can benefit from vocabulary normalization approaches when no dynamic information is available.
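To make the two normalization tasks concrete, the following is a minimal sketch of splitting and expansion. It is not the context-aware technique proposed here: it assumes identifiers follow camelCase or underscore conventions (the easy case) and uses a small, hypothetical abbreviation dictionary, whereas a contextual approach would draw candidate expansions from the surrounding code and other artifacts. The function names and the dictionary contents are illustrative assumptions.

```python
import re

# Hypothetical abbreviation dictionary (an assumption for illustration only).
# A context-aware approach would instead derive candidate expansions from
# the enclosing function, file, or related software artifacts.
ABBREVIATIONS = {
    "ptr": "pointer",
    "cnt": "counter",
    "usr": "user",
    "img": "image",
}


def split_identifier(identifier):
    """Split an identifier on underscores, camelCase boundaries, and digits."""
    tokens = []
    for part in re.split(r"[_\W]+", identifier):
        # Acronym runs (HTML), capitalized or lowercase words, digit runs.
        tokens.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", part))
    return [token.lower() for token in tokens]


def expand_tokens(tokens, dictionary=ABBREVIATIONS):
    """Replace each known abbreviation with its expansion; keep other tokens."""
    return [dictionary.get(token, token) for token in tokens]


def normalize(identifier):
    """Splitting followed by expansion."""
    return expand_tokens(split_identifier(identifier))


if __name__ == "__main__":
    for name in ("getUsrCnt", "img_ptr2", "HTMLParser"):
        print(name, "->", normalize(name))
```

A convention-based splitter like this fails on identifiers such as `usrcnt`, where no case change or separator marks the term boundary, which is the kind of case that motivates exploiting context.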