Inducing Probabilistic Grammars by Bayesian Model Merging
ICGI '94 Proceedings of the Second International Colloquium on Grammatical Inference and Applications
Stochastic inversion transduction grammars and bilingual parsing of parallel corpora
Computational Linguistics
A generative constituent-context model for improved grammar induction
ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Multitext Grammars and synchronous parsers
NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
Learning non-isomorphic tree mappings for machine translation
ACL '03 Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 2
Learning Gaussian processes from multiple tasks
ICML '05 Proceedings of the 22nd international conference on Machine learning
Corpus-based induction of syntactic structure: models of dependency and constituency
ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics
A hierarchical phrase-based model for statistical machine translation
ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Shared logistic normal distributions for soft parameter tying in unsupervised grammar induction
NAACL '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Unsupervised multilingual grammar induction
ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1 - Volume 1
Bayesian multitask learning with latent hierarchies
UAI '09 Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence
Semantic separator learning and its applications in unsupervised Chinese text parsing
Frontiers of Computer Science: Selected Publications from Chinese Universities
Hi-index | 0.00 |
We propose a corpus-based probabilistic framework to extract hidden common syntax across languages from non-parallel multilingual corpora in an unsupervised fashion. For this purpose, we assume a generative model for multilingual corpora, where each sentence is generated from a language dependent probabilistic context-free grammar (PCFG), and these PCFGs are generated from a prior grammar that is common across languages. We also develop a variational method for efficient inference. Experiments on a non-parallel multilingual corpus of eleven languages demonstrate the feasibility of the proposed method.