Two Experiments on Learning Probabilistic Dependency Grammars from Corpora
Two Experiments on Learning Probabilistic Dependency Grammars from Corpora
Head-driven statistical models for natural language parsing
Head-driven statistical models for natural language parsing
Robust probabilistic predictive syntactic processing: motivations, models, and applications
Robust probabilistic predictive syntactic processing: motivations, models, and applications
Building a large annotated corpus of English: the penn treebank
Computational Linguistics - Special issue on using large corpora: II
PCFG models of linguistic tree representations
Computational Linguistics
A maximum-entropy-inspired parser
NAACL 2000 Proceedings of the 1st North American chapter of the Association for Computational Linguistics conference
Three generative, lexicalised models for statistical parsing
ACL '98 Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational Linguistics
A sentence analysis method for a Japanese book reading machine for the blind
ACL '86 Proceedings of the 24th annual meeting on Association for Computational Linguistics
ACL '80 Proceedings of the 18th annual meeting on Association for Computational Linguistics
Inside-outside reestimation from partially bracketed corpora
ACL '92 Proceedings of the 30th annual meeting on Association for Computational Linguistics
Exploring the role of punctuation in parsing natural text
COLING '94 Proceedings of the 15th conference on Computational linguistics - Volume 1
Text genre detection using common word frequencies
COLING '00 Proceedings of the 18th conference on Computational linguistics - Volume 2
Efficient parsing for bilexical context-free grammars and head automaton grammars
ACL '99 Proceedings of the 37th annual meeting of the Association for Computational Linguistics on Computational Linguistics
Integer linear programming inference for conditional random fields
ICML '05 Proceedings of the 22nd international conference on Machine learning
Inducing syntactic categories by context distribution clustering
ConLL '00 Proceedings of the 2nd workshop on Learning language in logic and the 4th conference on Computational natural language learning - Volume 7
Parsing and disfluency placement
EMNLP '02 Proceedings of the ACL-02 conference on Empirical methods in natural language processing - Volume 10
Corpus-based induction of syntactic structure: models of dependency and constituency
ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics
Effective use of prosody in parsing conversational speech
HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
Restoring punctuation and capitalization in transcribed speech
ICASSP '09 Proceedings of the 2009 IEEE International Conference on Acoustics, Speech and Signal Processing
CoNLL-X shared task on multilingual dependency parsing
CoNLL-X '06 Proceedings of the Tenth Conference on Computational Natural Language Learning
Evaluating unsupervised part-of-speech tagging for grammar induction
COLING '08 Proceedings of the 22nd International Conference on Computational Linguistics - Volume 1
A more precise analysis of punctuation for broad-coverage surface realization with CCG
GEAF '08 Proceedings of the Workshop on Grammar Engineering Across Frameworks
Improving unsupervised dependency parsing with richer contexts and smoothing
NAACL '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Joint parsing and named entity recognition
NAACL '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Learning and inference over constrained output
IJCAI'05 Proceedings of the 19th international joint conference on Artificial intelligence
Parsing with soft and hard constraints on dependency length
Parsing '05 Proceedings of the Ninth International Workshop on Parsing Technology
Punctuation as implicit annotations for chinese word segmentation
Computational Linguistics
Semantic role chunking combining complementary syntactic views
CONLL '05 Proceedings of the Ninth Conference on Computational Natural Language Learning
Painless unsupervised learning with features
HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Profiting from mark-up: hyper-text annotations for guided parsing
ACL '10 Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
Viterbi training improves unsupervised dependency parsing
CoNLL '10 Proceedings of the Fourteenth Conference on Computational Natural Language Learning
Better punctuation prediction with dynamic conditional random fields
EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
Unsupervised induction of tree substitution grammars for dependency parsing
EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics
Integrating punctuation rules and naïve bayesian model for chinese creation title recognition
IJCNLP'05 Proceedings of the Second international joint conference on Natural Language Processing
Reducing the size of the representation for the uDOP-estimate
EMNLP '11 Proceedings of the First Workshop on Unsupervised Learning in NLP
Unsupervised dependency parsing without gold part-of-speech tags
EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Concavity and initialization for unsupervised dependency parsing
NAACL HLT '12 Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Fast unsupervised dependency parsing with arc-standard transitions
ROBUS-UNSUP '12 Proceedings of the Joint Workshop on Unsupervised and Semi-Supervised Learning in NLP
Capitalization cues improve dependency grammar induction
WILS '12 Proceedings of the NAACL-HLT Workshop on the Induction of Linguistic Structure
Exploiting reducibility in unsupervised dependency parsing
EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
Three dependency-and-boundary models for grammar induction
EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
Hi-index | 0.00 |
We show how punctuation can be used to improve unsupervised dependency parsing. Our linguistic analysis confirms the strong connection between English punctuation and phrase boundaries in the Penn Treebank. However, approaches that naively include punctuation marks in the grammar (as if they were words) do not perform well with Klein and Manning's Dependency Model with Valence (DMV). Instead, we split a sentence at punctuation and impose parsing restrictions over its fragments. Our grammar inducer is trained on the Wall Street Journal (WSJ) and achieves 59.5% accuracy out-of-domain (Brown sentences with 100 or fewer words), more than 6% higher than the previous best results. Further evaluation, using the 2006/7 CoNLL sets, reveals that punctuation aids grammar induction in 17 of 18 languages, for an overall average net gain of 1.3%. Some of this improvement is from training, but more than half is from parsing with induced constraints, in inference. Punctuation-aware decoding works with existing (even already-trained) parsing models and always increased accuracy in our experiments.