Identifier names are the main vehicle for semantic information during program comprehension. Identifier names are tokenised into their semantic constituents by tools supporting program comprehension tasks, including concept location and requirements traceability. We present an approach to the automated tokenisation of identifier names that improves on existing techniques in two ways. First, it improves tokenisation accuracy for identifier names of a single case and those containing digits. Second, performance gains over existing techniques are achieved using smaller oracles. Accuracy was evaluated by comparing the output of our algorithm to manual tokenisations of 28,000 identifier names drawn from 60 open source Java projects totalling 16.5 MSLOC. We also undertook a study of the typographical features of identifier names (single case, use of digits, etc.) per object-oriented construct (class names, method names, etc.), thus providing an insight into naming conventions in industrial-scale object-oriented code. Our tokenisation tool and datasets are publicly available.
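
To make the tokenisation task concrete, the following is a minimal sketch of a generic identifier splitter. It is not the algorithm presented in the paper: it assumes a conventional split on separators and internal capitalisation, followed by a greedy longest-prefix match against a word list for single-case tokens. The names `ORACLE`, `split_on_boundaries`, `split_single_case` and `tokenise` are hypothetical, and the tiny oracle exists only for illustration.

```python
import re

# Hypothetical toy oracle; a real tool would use a far larger word list.
ORACLE = {"time", "stamp", "file", "name", "out", "put",
          "output", "stream", "http", "server"}


def split_on_boundaries(name):
    """Split on separator characters and on internal capitalisation."""
    tokens = []
    for part in re.split(r"[_$]+", name):
        # Acronym runs (HTTP), capitalised or lower-case words, digit runs.
        tokens.extend(
            re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+", part))
    return tokens


def split_single_case(token, oracle=ORACLE):
    """Greedy longest-prefix match against the oracle.

    Returns the lower-cased token unsplit if any part cannot be matched.
    """
    lower = token.lower()
    words, i = [], 0
    while i < len(lower):
        for j in range(len(lower), i, -1):
            if lower[i:j] in oracle:
                words.append(lower[i:j])
                i = j
                break
        else:
            # No oracle word starts at position i: give up on splitting.
            return [lower]
    return words


def tokenise(name, oracle=ORACLE):
    """Tokenise an identifier name into lower-cased tokens."""
    tokens = []
    for tok in split_on_boundaries(name):
        if tok.isalpha() and (tok.islower() or tok.isupper()):
            tokens.extend(split_single_case(tok, oracle))
        else:
            tokens.append(tok.lower())
    return tokens


print(tokenise("getFileName"))   # ['get', 'file', 'name']
print(tokenise("outputstream"))  # ['output', 'stream']
print(tokenise("HTTPServer2"))   # ['http', 'server', '2']
```

The sketch treats every digit run as a separate token and gives up on a single-case name as soon as one segment is missing from the oracle. Those are the cases (single-case names and names containing digits) where the abstract reports that accuracy needs improving, so the sketch should be read as a baseline illustration only.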