A cross-lingual framework for web news taxonomy integration

  • Authors:
  • Cheng-Zen Yang; Che-Min Chen; Ing-Xiang Chen

  • Affiliations:
  • Department of Computer Science and Engineering, Yuan Ze University, Taiwan, R.O.C. (all three authors)

  • Venue:
  • AIRS'06 Proceedings of the Third Asia conference on Information Retrieval Technology
  • Year:
  • 2006

Abstract

Many news sites now publish articles online, and many Web news portals have emerged that cluster news into categories so that users can browse related reports and understand news events in depth. However, to the best of our knowledge, most Web news portals provide only monolingual news clustering services. In this paper, we study the cross-lingual Web news taxonomy integration problem, in which news articles about the same news event reported in different languages are to be integrated into one category. Our study builds on cross-lingual classification research results and the cross-training concept to construct SVM-based classifiers for cross-lingual Web news taxonomy integration. We have conducted several experiments using news articles from Google News as the experimental data sets. The experimental results show that the proposed cross-training classifiers outperform the traditional SVM classifiers in an all-round manner. We believe that the proposed framework can be applied to different bilingual environments.