Principal component neural networks: theory and applications
Principal component neural networks: theory and applications
Dictionary Methods for Cross-Lingual Information Retrieval
DEXA '96 Proceedings of the 7th International Conference on Database and Expert Systems Applications
The Journal of Machine Learning Research
An IR approach for translating new words from nonparallel, comparable texts
COLING '98 Proceedings of the 17th international conference on Computational linguistics - Volume 1
Automatic identification of word translations from unrelated English and German corpora
ACL '99 Proceedings of the 37th annual meeting of the Association for Computational Linguistics on Computational Linguistics
Improving Machine Translation Performance by Exploiting Non-Parallel Corpora
Computational Linguistics
Thumbs up?: sentiment classification using machine learning techniques
EMNLP '02 Proceedings of the ACL-02 conference on Empirical methods in natural language processing - Volume 10
StatMT '07 Proceedings of the Second Workshop on Statistical Machine Translation
NRC's PORTAGE system for WMT 2007
StatMT '07 Proceedings of the Second Workshop on Statistical Machine Translation
EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 2 - Volume 2
On smoothing and inference for topic models
UAI '09 Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence
Cross-lingual latent topic extraction
ACL '10 Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
Probabilistic latent semantic analysis
UAI'99 Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence
Extracting multilingual topics from unaligned comparable corpora
ECIR'2010 Proceedings of the 32nd European conference on Advances in Information Retrieval
Rich prior knowledge in learning for NLP
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Tutorial Abstracts of ACL 2011
From bilingual dictionaries to interlingual document representations
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: short papers - Volume 2
Clickthrough-based latent semantic models for web search
Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
No free lunch: brute force vs. locality-sensitive hashing for cross-lingual pairwise similarity
Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
Learning discriminative projections for text similarity measures
CoNLL '11 Proceedings of the Fifteenth Conference on Computational Natural Language Learning
A cross-domain adaptation method for sentiment classification using probabilistic latent analysis
Proceedings of the 20th ACM international conference on Information and knowledge management
Improving bilingual projections via sparse covariance matrices
EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Cross-language information retrieval with latent topic models trained on a comparable corpus
AIRS'11 Proceedings of the 7th Asia conference on Information Retrieval Technology
Learning hash functions for cross-view similarity search
IJCAI'11 Proceedings of the Twenty-Second international joint conference on Artificial Intelligence - Volume Volume Two
Polarity inducing latent semantic analysis
EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
Question-answer topic model for question retrieval in community question answering
Proceedings of the 21st ACM international conference on Information and knowledge management
Cross-Language high similarity search using a conceptual thesaurus
CLEF'12 Proceedings of the Third international conference on Information Access Evaluation: multilinguality, multimodality, and visual analytics
Bidirectional semi-supervised learning with graphs
ECML PKDD'12 Proceedings of the 2012 European conference on Machine Learning and Knowledge Discovery in Databases - Volume Part II
Efficient Nearest-Neighbor Search in the Probability Simplex
Proceedings of the 2013 Conference on the Theory of Information Retrieval
Learning deep structured semantic models for web search using clickthrough data
Proceedings of the 22nd ACM international conference on Conference on information & knowledge management
Joint and coupled bilingual topic model based sentence representations for language model adaptation
IJCAI'13 Proceedings of the Twenty-Third international joint conference on Artificial Intelligence
Hi-index | 0.00 |
Representing documents by vectors that are independent of language enhances machine translation and multilingual text categorization. We use discriminative training to create a projection of documents from multiple languages into a single translingual vector space. We explore two variants to create these projections: Oriented Principal Component Analysis (OPCA) and Coupled Probabilistic Latent Semantic Analysis (CPLSA). Both of these variants start with a basic model of documents (PCA and PLSA). Each model is then made discriminative by encouraging comparable document pairs to have similar vector representations. We evaluate these algorithms on two tasks: parallel document retrieval for Wikipedia and Europarl documents, and cross-lingual text classification on Reuters. The two discriminative variants, OPCA and CPLSA, significantly outperform their corresponding baselines. The largest differences in performance are observed on the task of retrieval when the documents are only comparable and not parallel. The OPCA method is shown to perform best.