Title generation for machine-translated documents

  • Authors:
  • Rong Jin;Alexander G. Hauptmann

  • Affiliations:
  • Language Technologies Institute, Carnegie Mellon University, Pittsburgh, PA;Dept. of Computer Science, Carnegie Mellon University, Pittsburgh, PA

  • Venue:
  • IJCAI'01 Proceedings of the 17th international joint conference on Artificial intelligence - Volume 2
  • Year:
  • 2001

Quantified Score

Hi-index 0.00

Visualization

Abstract

In this paper, we present and compare automatically generated titles for machine-translated documents using several different statistics-based methods. A Naïve Bayesian, a K-Nearest Neighbour, a TF-IDF and an iterative Expectation-Maximization method for title generation were applied to 1000 original English news documents and again to the same documents translated from English into Portuguese, French or German and back to English using SYSTRAN. The AutoSummarization function of Microsoft Word was used as a base line. Results on several metrics show that the statistics-based methods of title generation for machine-translated documents are fairly language independent and title generation is possible at a level approaching the accuracy of titles generated for the original English documents.