Structured statistical syntax tree prediction

Authors:
Cyrus Omar
Affiliations:
Carnegie Mellon University, Pittsburgh, PA, USA
Venue:
Proceedings of the 2013 companion publication for conference on Systems, programming, & applications: software for humanity
Year:
2013

Abstract

Statistical models of source code can be used to improve code completion systems, assistive interfaces, and code compression engines. We are developing a statistical model in which programs are represented as syntax trees rather than simply as a stream of tokens. Our model, initially for the Java language, combines corpus data with information about syntax, types, and the program context. We tested this model on open source code corpora and found that it is significantly more accurate than the current state of the art, providing initial evidence for our claim that combining structural and statistical information is a fruitful strategy.
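
To make the idea of combining corpus statistics with structural information concrete, the following is a minimal illustrative sketch and not the model described in the paper. It assumes a hypothetical toy corpus of (expected type, syntactic form) pairs, as might be extracted from syntax trees, and estimates the probability of each form given the expected type using add-alpha smoothing. The names `CORPUS`, `FORMS`, `train`, `form_probabilities`, and `rank_forms` are all invented for this example.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus: (expected type, syntactic form) pairs that one
# might extract from the syntax trees of existing programs. The data is
# invented for illustration only.
CORPUS = [
    ("int", "literal"),
    ("int", "literal"),
    ("int", "variable"),
    ("int", "method_call"),
    ("String", "literal"),
    ("String", "method_call"),
    ("String", "method_call"),
    ("String", "method_call"),
]

# Assumed, deliberately tiny inventory of syntactic forms.
FORMS = ("literal", "variable", "method_call")


def train(corpus):
    """Count how often each syntactic form appears for each expected type."""
    counts = defaultdict(Counter)
    for expected_type, form in corpus:
        counts[expected_type][form] += 1
    return counts


def form_probabilities(counts, expected_type, alpha=1.0):
    """Estimate P(form | expected type) with add-alpha smoothing."""
    seen = counts.get(expected_type, Counter())
    total = sum(seen.values()) + alpha * len(FORMS)
    return {form: (seen[form] + alpha) / total for form in FORMS}


def rank_forms(counts, expected_type):
    """Return forms ordered from most to least probable for the given type."""
    probs = form_probabilities(counts, expected_type)
    return sorted(FORMS, key=lambda form: probs[form], reverse=True)


counts = train(CORPUS)
int_ranking = rank_forms(counts, "int")
string_ranking = rank_forms(counts, "String")
```

In this sketch the expected type stands in for the "types and program context" signal: the same position in a tree receives a different ranking of forms depending on the type expected there, which a model over a flat token stream would not see directly.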