Using KCCA for Japanese-English cross-language information retrieval and document classification

Authors:
Yaoyong Li; John Shawe-Taylor
Affiliations:
Department of Computer Science, The University of Sheffield, Sheffield, UK; ISIS Group, School of Electronics and Computer Science, University of Southampton, Southampton, UK
Venue:
Journal of Intelligent Information Systems
Year:
2006

Abstract

Kernel Canonical Correlation Analysis (KCCA) is a method for finding correlated linear relationships between two multidimensional variables in a kernel-defined feature space. We study a machine learning algorithm based on KCCA for cross-language information retrieval and apply it to Japanese-English cross-language information retrieval. The results are encouraging and are significantly better than those obtained by other state-of-the-art methods. Computational complexity is an important issue when applying KCCA to large datasets, as is typical in information retrieval, and we experimentally evaluate several methods for alleviating this problem. We also investigate cross-language document classification using KCCA as well as other methods. Our results show that it is feasible to use a classifier learned in one language to classify documents in other languages.
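
To make the idea of KCCA concrete, here is a minimal sketch of one common regularised formulation, in which the dual directions for the first view are eigenvectors of (Kx + reg·I)⁻¹ Ky (Ky + reg·I)⁻¹ Kx. This is an illustration under stated assumptions, not necessarily the algorithm used in the paper: the function names `center_kernel` and `kcca`, the parameter `reg`, and the dense NumPy solver are all hypothetical choices made for this example.

```python
import numpy as np


def center_kernel(K):
    """Center a kernel matrix in feature space: H K H with H = I - (1/n) 11^T."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H


def kcca(Kx, Ky, reg=1.0, n_components=1):
    """Regularised KCCA on two n x n kernel matrices over paired samples.

    Returns dual weights (alpha, beta) and the regularised correlations.
    Assumption: `reg` is a ridge term added to both kernels; the paper may
    use a different regularisation or solver.
    """
    n = Kx.shape[0]
    Kx = center_kernel(Kx)
    Ky = center_kernel(Ky)
    I = np.eye(n)

    Rx = np.linalg.solve(Kx + reg * I, Ky)  # (Kx + reg I)^-1 Ky
    Ry = np.linalg.solve(Ky + reg * I, Kx)  # (Ky + reg I)^-1 Kx

    # Eigenvalues of Rx @ Ry are the squared (regularised) correlations.
    eigvals, eigvecs = np.linalg.eig(Rx @ Ry)
    eigvals = eigvals.real
    order = np.argsort(-eigvals)[:n_components]

    corr = np.sqrt(np.clip(eigvals[order], 0.0, None))
    alpha = eigvecs[:, order].real
    # Dual weights for the second view follow from those of the first.
    beta = (Ry @ alpha) / np.maximum(corr, 1e-12)
    return alpha, beta, corr
```

In a cross-language setting, `Kx` and `Ky` would be kernel matrices computed over the two language sides of a set of paired documents. A document from either language can then be mapped into the shared space by multiplying its vector of (centered) kernel evaluations against the training documents by `alpha` or `beta`, and retrieval reduces to a similarity search in that space. The dense solve costs O(n³) in the number of training pairs, which is the kind of scaling concern the abstract raises for large datasets.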