Learning to Parse Natural Language with Maximum Entropy Models
Machine Learning - Special issue on natural language learning
Head-driven statistical models for natural language parsing
Head-driven statistical models for natural language parsing
Building a large annotated corpus of English: the penn treebank
Computational Linguistics - Special issue on using large corpora: II
TnT: a statistical part-of-speech tagger
ANLC '00 Proceedings of the sixth conference on Applied natural language processing
A maximum entropy approach to identifying sentence boundaries
ANLC '97 Proceedings of the fifth conference on Applied natural language processing
Inside-outside reestimation from partially bracketed corpora
ACL '92 Proceedings of the 30th annual meeting on Association for Computational Linguistics
Anchor text mining for translation of Web queries: A transitive translation approach
ACM Transactions on Information Systems (TOIS)
Efficient parsing for bilexical context-free grammars and head automaton grammars
ACL '99 Proceedings of the 37th annual meeting of the Association for Computational Linguistics on Computational Linguistics
Immediate-head parsing for language models
ACL '01 Proceedings of the 39th Annual Meeting on Association for Computational Linguistics
ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Feature-rich part-of-speech tagging with a cyclic dependency network
NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
Enriching the knowledge sources used in a maximum entropy part-of-speech tagger
EMNLP '00 Proceedings of the 2000 Joint SIGDAT conference on Empirical methods in natural language processing and very large corpora: held in conjunction with the 38th Annual Meeting of the Association for Computational Linguistics - Volume 13
Corpus-based induction of syntactic structure: models of dependency and constituency
ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics
Coarse-to-fine n-best parsing and MaxEnt discriminative reranking
ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Effective self-training for parsing
HLT-NAACL '06 Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics
Wikify!: linking documents to encyclopedic knowledge
Proceedings of the sixteenth ACM conference on Conference on information and knowledge management
Unsupervised query segmentation using generative language models and wikipedia
Proceedings of the 17th international conference on World Wide Web
The Unreasonable Effectiveness of Data
IEEE Intelligent Systems
CoNLL '09 Proceedings of the Thirteenth Conference on Computational Natural Language Learning
Automatic prediction of parser accuracy
EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
The linguistic structure of English web-search queries
EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Shared logistic normal distributions for soft parameter tying in unsupervised grammar induction
NAACL '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Improving unsupervised dependency parsing with richer contexts and smoothing
NAACL '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Joint parsing and named entity recognition
NAACL '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Computing semantic relatedness using Wikipedia-based explicit semantic analysis
IJCAI'07 Proceedings of the 20th international joint conference on Artifical intelligence
Semi-supervised learning of dependency parsers using generalized expectation criteria
ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1 - Volume 1
Distant supervision for relation extraction without labeled data
ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 2 - Volume 2
WikiWalk: random walks on Wikipedia for semantic relatedness
TextGraphs-4 Proceedings of the 2009 Workshop on Graph-based Methods for Natural Language Processing
From baby steps to Leapfrog: how "Less is More" in unsupervised dependency parsing
HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Viterbi training improves unsupervised dependency parsing
CoNLL '10 Proceedings of the Fourteenth Conference on Computational Natural Language Learning
Using a partially annotated corpus to build a dependency parser for japanese
IJCNLP'05 Proceedings of the Second international joint conference on Natural Language Processing
Viterbi training improves unsupervised dependency parsing
CoNLL '10 Proceedings of the Fourteenth Conference on Computational Natural Language Learning
Unsupervised induction of tree substitution grammars for dependency parsing
EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
Covariance in Unsupervised Learning of Probabilistic Grammars
The Journal of Machine Learning Research
Neutralizing linguistically problematic annotations in unsupervised dependency parsing evaluation
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Punctuation: making a point in unsupervised dependency parsing
CoNLL '11 Proceedings of the Fifteenth Conference on Computational Natural Language Learning
Reducing the size of the representation for the uDOP-estimate
EMNLP '11 Proceedings of the First Workshop on Unsupervised Learning in NLP
Enhancing Chinese word segmentation using unlabeled data
EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Lateen EM: unsupervised training with multiple objectives, applied to dependency grammar induction
EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Fast unsupervised dependency parsing with arc-standard transitions
ROBUS-UNSUP '12 Proceedings of the Joint Workshop on Unsupervised and Semi-Supervised Learning in NLP
Capitalization cues improve dependency grammar induction
WILS '12 Proceedings of the NAACL-HLT Workshop on the Induction of Linguistic Structure
Hi-index | 0.00 |
We show how web mark-up can be used to improve unsupervised dependency parsing. Starting from raw bracketings of four common HTML tags (anchors, bold, italics and underlines), we refine approximate partial phrase boundaries to yield accurate parsing constraints. Conversion procedures fall out of our linguistic analysis of a newly available million-word hyper-text corpus. We demonstrate that derived constraints aid grammar induction by training Klein and Manning's Dependency Model with Valence (DMV) on this data set: parsing accuracy on Section 23 (all sentences) of the Wall Street Journal corpus jumps to 50.4%, beating previous state-of-the-art by more than 5%. Web-scale experiments show that the DMV, perhaps because it is unlexicalized, does not benefit from orders of magnitude more annotated but noisier data. Our model, trained on a single blog, generalizes to 53.3% accuracy out-of-domain, against the Brown corpus --- nearly 10% higher than the previous published best. The fact that web mark-up strongly correlates with syntactic structure may have broad applicability in NLP.