We study several machine learning algorithms for cross-language patent retrieval and classification. In contrast to most other studies involving machine learning for cross-language information retrieval, which mainly applied learning techniques to monolingual sub-tasks, our learning algorithms exploit bilingual training documents and learn a semantic representation from them. We study Japanese-English cross-language patent retrieval using Kernel Canonical Correlation Analysis (KCCA), a method for finding correlated linear relationships between two variables in kernel-defined feature spaces. The results are quite encouraging and are significantly better than those obtained by other state-of-the-art methods. We also investigate learning algorithms for cross-language document classification. The learning algorithms are based on KCCA and Support Vector Machines (SVM). In particular, we study two ways of combining KCCA and SVM and find that one particular combination, called SVM_2k, achieves better results than the other learning algorithms for both bilingual and monolingual test documents.
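To make the KCCA step concrete, the following is a minimal sketch of how paired documents in two languages might be mapped into a shared space and used for retrieval. It is an illustration under stated assumptions, not the authors' implementation: random toy vectors stand in for Japanese and English patent features, linear kernels are used, the regularised eigenproblem is one commonly used KCCA formulation, and the names `kcca` and `rank_by_cosine` as well as the `reg` value are hypothetical choices made for this example.

```python
import numpy as np


def kcca(Kx, Ky, reg=0.1, n_components=2):
    """Regularised KCCA on two centred n x n kernel matrices.

    Solves (Kx + reg*I)^-1 Ky (Ky + reg*I)^-1 Kx alpha = rho^2 alpha
    and returns dual coefficients (alpha, beta), one column per component.
    """
    n = Kx.shape[0]
    I = np.eye(n)
    M = np.linalg.solve(Kx + reg * I, Ky) @ np.linalg.solve(Ky + reg * I, Kx)
    eigvals, eigvecs = np.linalg.eig(M)
    # Keep the directions with the largest squared correlation.
    order = np.argsort(-eigvals.real)[:n_components]
    alpha = eigvecs[:, order].real
    # Matching coefficients for the second view (up to a positive scale).
    beta = np.linalg.solve(Ky + reg * I, Kx @ alpha)
    return alpha, beta


def rank_by_cosine(Q, D):
    """For each row of Q, return indices of rows of D sorted by cosine similarity."""
    Qn = Q / np.linalg.norm(Q, axis=1, keepdims=True)
    Dn = D / np.linalg.norm(D, axis=1, keepdims=True)
    return np.argsort(-(Qn @ Dn.T), axis=1)


# Toy paired data (assumption): both "languages" are noisy views of a shared latent topic.
rng = np.random.default_rng(0)
n_train, n_test, n_latent = 60, 20, 3
n_all = n_train + n_test
Z = rng.normal(size=(n_all, n_latent))
X = Z @ rng.normal(size=(n_latent, 8)) + 0.05 * rng.normal(size=(n_all, 8))
Y = Z @ rng.normal(size=(n_latent, 6)) + 0.05 * rng.normal(size=(n_all, 6))
X_tr, X_te = X[:n_train], X[n_train:]
Y_tr, Y_te = Y[:n_train], Y[n_train:]

# Centring the features with the training mean centres the linear kernels.
mx, my = X_tr.mean(axis=0), Y_tr.mean(axis=0)
Xc, Yc = X_tr - mx, Y_tr - my
Kx, Ky = Xc @ Xc.T, Yc @ Yc.T

alpha, beta = kcca(Kx, Ky, reg=0.1, n_components=3)

# Correlation of the first pair of projections on the training documents.
train_corr = np.corrcoef(Kx @ alpha[:, 0], Ky @ beta[:, 0])[0, 1]

# Cross-language retrieval: queries from view X, candidate documents from view Y.
Q = (X_te - mx) @ Xc.T @ alpha
D = (Y_te - my) @ Yc.T @ beta
ranking = rank_by_cosine(Q, D)
```

Row `i` of `ranking` lists the candidate documents for query `i`, best match first. The projected vectors `Kx @ alpha` and `Ky @ beta` could also serve as input features for a classifier such as an SVM, which is one plausible reading of combining KCCA with SVM; the abstract does not specify how SVM_2k performs that combination.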