This paper uses statistical word alignment over multiple parallel texts to identify string spans that cannot be constituents in one of the languages. This information is exploited in monolingual PCFG induction for that language, within an augmented version of the inside-outside algorithm. Besides the aligned corpus, no other resources are required. We discuss an implemented system and present experimental results, evaluated against the Penn Treebank.
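
To make the idea of alignment-derived non-constituent spans concrete, here is a minimal Python sketch. The abstract does not state the paper's criterion, so the rule below is an assumption made for illustration: a source span is flagged when a source word outside the span aligns into the target range covered by the span's own alignments. The names `is_blocked_span` and `blocked_spans` are hypothetical.

```python
from typing import Set, Tuple

# An alignment is a set of (source_index, target_index) links, 0-based.
Alignment = Set[Tuple[int, int]]


def is_blocked_span(start: int, end: int, alignment: Alignment) -> bool:
    """Return True if source span [start, end) fails the contiguity heuristic.

    Assumed heuristic (not taken from the paper): the span is flagged when
    some source word outside the span is aligned to a target position that
    lies between the leftmost and rightmost target positions of the span.
    """
    inside_targets = {t for s, t in alignment if start <= s < end}
    if not inside_targets:
        return False  # no aligned words in the span, so no evidence
    lo, hi = min(inside_targets), max(inside_targets)
    return any(
        lo <= t <= hi and not (start <= s < end)
        for s, t in alignment
    )


def blocked_spans(sentence_length: int, alignment: Alignment) -> Set[Tuple[int, int]]:
    """Collect every multi-word source span flagged by the heuristic."""
    return {
        (i, j)
        for i in range(sentence_length)
        for j in range(i + 2, sentence_length + 1)
        if is_blocked_span(i, j, alignment)
    }


if __name__ == "__main__":
    # Four source words; source words 1 and 2 swap order on the target side.
    links = {(0, 0), (1, 2), (2, 1), (3, 3)}
    print(sorted(blocked_spans(4, links)))
```

With several parallel texts, one plausible way to combine the evidence is to take the union of the flagged spans from each language pair. A chart-based inside-outside implementation could then assign zero probability to chart cells covering flagged spans. Both steps are assumptions about how such a filter might be used, not a description of the implemented system.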