We consider a new subproblem of unsupervised parsing from raw text: unsupervised partial parsing, the unsupervised version of text chunking. We show that addressing this task directly, using probabilistic finite-state methods, produces better results than relying on the local predictions of a current best unsupervised parser, Seginer's (2007) CCL. These finite-state models are combined in a cascade to produce more general (full-sentence) constituent structures; doing so outperforms CCL by a wide margin in unlabeled PARSEVAL scores for English, German and Chinese. Finally, we address the use of phrasal punctuation as a heuristic indicator of phrasal boundaries, both in our system and in CCL.
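
For readers unfamiliar with the evaluation measure mentioned above, the following is a minimal sketch of how unlabeled bracket precision, recall and F1 are commonly computed. It is not the evaluation code used in this work. The function name `unlabeled_prf` and the representation of a constituent as a `(start, end)` token span are assumptions made for illustration, and conventions that vary between papers (for example, whether single-word or whole-sentence brackets are counted, or how punctuation is treated) are deliberately left to the caller.

```python
def unlabeled_prf(gold, pred):
    """Micro-averaged unlabeled bracket precision, recall and F1.

    gold, pred: one entry per sentence, each an iterable of
    (start, end) token spans. Both names are hypothetical; this is
    an illustrative sketch, not the paper's evaluation script.
    """
    matched = n_gold = n_pred = 0
    for gold_spans, pred_spans in zip(gold, pred):
        # Assumption: duplicate brackets within a sentence count once.
        g, p = set(gold_spans), set(pred_spans)
        matched += len(g & p)
        n_gold += len(g)
        n_pred += len(p)
    precision = matched / n_pred if n_pred else 0.0
    recall = matched / n_gold if n_gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy example: one five-token sentence, spans are end-exclusive.
gold = [[(0, 2), (2, 5), (0, 5)]]
pred = [[(0, 2), (3, 5), (0, 5)]]
precision, recall, f1 = unlabeled_prf(gold, pred)
```

In the toy example two of the three predicted brackets match gold brackets, so precision, recall and F1 all come out at two thirds. The same function applies to chunk-level evaluation if each sentence's spans are restricted to the low-level constituents of interest.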