Prior research has shown that source code, like natural language, exhibits naturalness: it is written by humans and is likely to be repetitive. That research also showed that an n-gram language model, trained on a large corpus of existing source code, is useful in predicting the next token in a source file. In this paper, we investigate how well statistical machine translation (SMT) models for natural languages can help in migrating source code from one programming language to another. We treat source code as a sequence of lexical tokens and apply a phrase-based SMT model to the lexemes of those tokens. Our empirical evaluation on migrating two Java projects to C# showed that lexical, phrase-based SMT can achieve high lexical translation accuracy (BLEU scores of 81.3-82.6%). Users would have to manually edit only 11.9-15.8% of the tokens in the resulting code to correct it. However, a high percentage of the translated methods (49.5-58.6%) are syntactically incorrect. Our results therefore call for a more program-oriented SMT model that better integrates the syntactic and semantic information of a program to support language migration.
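To make the lexical view concrete, the following is a minimal Python sketch of two ideas from the abstract: treating code as a flat sequence of lexemes, and scoring a translated snippet against a reference by clipped n-gram overlap, which is the core ingredient of BLEU. It is an illustration under stated assumptions, not the evaluation setup used in the paper. The regular-expression lexer, the function names, and the two C#-like snippets are all hypothetical, and the score omits BLEU's brevity penalty and the geometric mean over n-gram orders.

```python
import re
from collections import Counter

# Hypothetical, simplified lexer (an assumption for illustration):
# identifiers/keywords, integer literals, and single punctuation characters.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def lex(code):
    """Split source text into a flat list of lexemes."""
    return TOKEN_RE.findall(code)


def ngrams(tokens, n):
    """Count every contiguous n-gram in a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, reference, n):
    """Clipped n-gram precision: the building block of BLEU, not full BLEU."""
    cand = ngrams(candidate, n)
    ref = ngrams(reference, n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    # Each candidate n-gram is credited at most as often as it occurs in the reference.
    clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
    return clipped / total


# Made-up example: a reference C# statement and an imperfect machine translation.
reference = lex("Console.WriteLine(name);")
candidate = lex("Console.println(name);")

unigram = modified_precision(candidate, reference, 1)
bigram = modified_precision(candidate, reference, 2)
```

In this made-up pair, a single wrong lexeme (`println`) leaves most unigrams intact but breaks two of the six bigrams around it. A token-overlap score of this kind says nothing about whether the output parses, which is consistent with the abstract's observation that high lexical accuracy can coexist with many syntactically incorrect methods.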