Recent research has applied the statistical n-gram language model to source code and shown that code exhibits a high degree of repetition. The n-gram model has also been shown to predict code well enough to support code suggestion and completion. However, the state-of-the-art n-gram approach captures source code regularities and patterns using only the lexical information in the local context of code units. To improve predictability, we introduce SLAMC, a novel statistical semantic language model for source code. It annotates code tokens with semantic information and models the regularities and patterns of these semantic annotations, called sememes, rather than of the tokens' lexemes. It combines the local context captured by semantic n-grams with global technical concerns and functionality in an n-gram topic model, together with pairwise associations of program elements. Based on SLAMC, we developed a new code suggestion method. In an empirical evaluation on several projects, it achieved 18-68% higher accuracy, in relative terms, than the state-of-the-art approach.
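To make the lexical baseline concrete, the following is a minimal sketch of n-gram-based next-token suggestion over code tokens. It is not SLAMC: it uses raw lexemes only, with no sememes, topic model, or pairwise associations. The function names `train_ngram` and `suggest`, the `<s>` padding token, and the toy corpus are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def train_ngram(token_sequences, n=3):
    """Count which token follows each (n-1)-token context."""
    counts = defaultdict(Counter)
    for tokens in token_sequences:
        # Pad the start so the first tokens also have a full context.
        padded = ["<s>"] * (n - 1) + list(tokens)
        for i in range(n - 1, len(padded)):
            context = tuple(padded[i - n + 1:i])
            counts[context][padded[i]] += 1
    return counts


def suggest(counts, prefix, n=3, k=3):
    """Return up to k most frequent next tokens after the last n-1 tokens of prefix."""
    padded = ["<s>"] * (n - 1) + list(prefix)
    context = tuple(padded[len(padded) - (n - 1):])
    return [token for token, _ in counts[context].most_common(k)]


# Toy corpus (hypothetical): three pre-tokenized loop headers.
corpus = [
    "for ( int i = 0 ; i < n ; i ++ )".split(),
    "for ( int j = 0 ; j < m ; j ++ )".split(),
    "for ( int i = 0 ; i < len ; i ++ )".split(),
]
model = train_ngram(corpus, n=3)
print(suggest(model, ["for", "(", "int"]))  # ['i', 'j']
```

Because the context here is purely lexical, the model treats `i` and `j` as unrelated symbols even though they play the same role in each loop. That limitation is what the abstract's move from lexemes to semantic annotations is meant to address.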