The evaluation of computer-produced texts is recognized as an important research problem for automatic text summarization and machine translation. Traditionally, computer-produced texts have been evaluated automatically by measuring their n-gram overlap with human-produced texts. However, such methods cannot evaluate texts correctly when the n-grams of a computer-produced text and a human-produced text do not overlap, even though the two texts convey the same meaning. We explored the use of paraphrases to refine these traditional automatic evaluation methods. To confirm the effectiveness of our method, we conducted experiments on data from the Text Summarization Challenge 2, and found that paraphrases created with a statistical machine translation technique could improve the traditional evaluation method.
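
To make the failure mode concrete, below is a minimal Python sketch, not the authors' implementation, of ROUGE-style n-gram recall together with a toy paraphrase-substitution step. All function names are illustrative, and the hand-written paraphrase table stands in for pairs that the paper derives with a statistical machine translation technique.

    # A minimal sketch (assumed names, not the authors' code) of n-gram
    # overlap scoring with an optional paraphrase-substitution step.

    from collections import Counter

    def ngrams(tokens, n):
        # Multiset of n-grams in a token sequence.
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    def ngram_recall(candidate, reference, n=2):
        # ROUGE-style recall: clipped n-gram overlap / reference n-gram count.
        cand, ref = ngrams(candidate, n), ngrams(reference, n)
        overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
        total = sum(ref.values())
        return overlap / total if total else 0.0

    def apply_paraphrases(tokens, table):
        # Greedy word-for-word rewriting from a paraphrase table. The real
        # method learns such pairs with a statistical MT technique; this
        # hand-written table is purely illustrative.
        return [table.get(tok, tok) for tok in tokens]

    if __name__ == "__main__":
        reference = "the committee authorized the new budget".split()
        candidate = "the committee approved the new budget".split()

        print(ngram_recall(candidate, reference))   # 0.6: the synonym breaks two bigrams
        table = {"authorized": "approved"}          # hypothetical paraphrase pair
        rewritten = apply_paraphrases(reference, table)
        print(ngram_recall(candidate, rewritten))   # 1.0 after rewriting the reference

Run as a script, the example scores the candidate at a bigram recall of 0.6 against the raw reference and 1.0 after the synonymous verb is rewritten: the two texts convey the same meaning, but plain n-gram matching penalizes the wording difference until a paraphrase maps one wording onto the other.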