In this paper, we study the problem of extracting technical paraphrases from a parallel software corpus, namely a collection of duplicate bug reports. Paraphrase acquisition is a fundamental task in the emerging area of text mining for software engineering. Existing paraphrase extraction methods are not entirely suitable here because bug reports are noisy. We propose a number of techniques to address this noisy-data problem. Our empirical evaluation shows that the resulting method improves significantly on an existing method, by up to 58%.