AIRS'05 Proceedings of the Second Asia conference on Asia Information Retrieval Technology
Many news sites now publish articles online, and many Web news portals have emerged that cluster news into categories so that users can browse related reports and understand news events in depth. However, to the best of our knowledge, most Web news portals provide only monolingual news clustering services. In this paper, we study the cross-lingual Web news taxonomy integration problem, in which news articles about the same news event reported in different languages are to be integrated into one category. Our study builds on cross-lingual classification research results and the cross-training concept to construct SVM-based classifiers for cross-lingual Web news taxonomy integration. We have conducted several experiments using news articles from Google News as the experimental data sets. The experimental results show that the proposed cross-training classifiers outperform the traditional SVM classifiers in an all-round manner. We believe that the proposed framework can be applied to different bilingual environments.
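As an illustration only, the cross-training idea can be sketched on toy data: a classifier trained on one portal's taxonomy supplies an extra category feature for the classifier of the other portal. This is one common reading of cross-training, not the paper's implementation. The shared term space (standing in for some cross-lingual mapping such as translation), the two-category taxonomies, the feature names, and the small hinge-loss trainer used in place of a real SVM library are all assumptions made for this sketch.

```python
# Toy sketch of cross-training with linear SVMs (pure Python, no dependencies).
# Assumptions, not from the paper: both portals' articles are already mapped
# into one shared term space, each taxonomy has two categories (+1 / -1), and
# a small hinge-loss trainer stands in for a real SVM library.

def train_linear_svm(X, y, epochs=100, eta=0.1, lam=0.01):
    """Fit a bias-free linear SVM by subgradient descent on the hinge loss."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            w = [wj * (1.0 - eta * lam) for wj in w]      # L2 shrinkage
            if margin < 1.0:                              # inside the margin
                w = [wj + eta * yi * xj for wj, xj in zip(w, xi)]
    return w

def predict(w, x):
    """Return +1 or -1 according to the sign of the linear score."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) > 0 else -1

# Shared term space: [election, vote, match, goal] (hypothetical features).
docs_b = [[1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 1, 1], [0, 0, 0, 1]]
labels_b = [1, 1, -1, -1]          # categories in portal B's taxonomy
docs_a = [[0, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 0], [0, 0, 1, 1]]
labels_a = [1, 1, -1, -1]          # categories in portal A's taxonomy

# Step 1: learn portal B's taxonomy from its own articles.
svm_b = train_linear_svm(docs_b, labels_b)

# Step 2: append the predicted B category to each A article as one more feature.
docs_a_aug = [x + [predict(svm_b, x)] for x in docs_a]

# Step 3: learn portal A's taxonomy from terms plus the B-category feature.
svm_a = train_linear_svm(docs_a_aug, labels_a)

def integrate_into_a(doc_b, category_b):
    """Place a portal-B article into A's taxonomy using its known B category."""
    return predict(svm_a, doc_b + [category_b])

if __name__ == "__main__":
    print(integrate_into_a([1, 0, 0, 0], 1))
    print(integrate_into_a([0, 0, 0, 1], -1))
    print(integrate_into_a([0, 0, 0, 0], 1))
```

In this toy setup, the last call passes an article with no informative terms at all, so its placement in taxonomy A rests entirely on the category it already has in taxonomy B. That is the extra signal a plain term-only SVM would not have.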