We describe a case study in which a memory-based learning algorithm is trained to simultaneously chunk sentences and assign grammatical function tags to these chunks. We compare the algorithm's performance on this parsing task across varying training set sizes (yielding learning curves) and different input representations. In particular, we compare input consisting of words only, a variant that includes word form information for low-frequency words, gold-standard POS tags only, and combinations of these. The word-based shallow parser displays an apparently log-linear increase in performance and surpasses the flatter POS-based curve at about 50,000 sentences of training data. The low-frequency variant performs even better, and the combination of words and POS tags is best. Comparative experiments with a real POS tagger produce lower results. We argue that an explicit intermediate POS-tagging step may be unnecessary for parsing when a sufficient amount of training material is available and word form information is used for low-frequency words.
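To make the approach concrete, the following is a minimal sketch of memory-based (k-nearest-neighbour, IB1-style) chunking over fixed-width word windows. Everything here is an illustrative assumption, not the paper's actual setup: the class name, the window size, the unweighted overlap metric, and the toy IOB-tagged training sentences are all hypothetical stand-ins for a TiMBL-style learner trained on treebank material.

```python
from collections import Counter

def windows(words, size=2):
    """Turn a sentence into fixed-width feature windows, one per token."""
    pad = ["_"] * size
    padded = pad + words + pad
    return [tuple(padded[i:i + 2 * size + 1]) for i in range(len(words))]

def overlap(a, b):
    """Unweighted overlap similarity: number of matching feature positions."""
    return sum(x == y for x, y in zip(a, b))

class MemoryBasedChunker:
    """Store all training windows verbatim ("forgetting exceptions is
    harmful"); classify new tokens by majority vote of the k nearest
    neighbours under the overlap metric."""

    def __init__(self, k=3):
        self.k = k
        self.memory = []  # list of (window, chunk_tag) pairs

    def train(self, sentence, tags):
        self.memory.extend(zip(windows(sentence), tags))

    def tag(self, sentence):
        out = []
        for w in windows(sentence):
            ranked = sorted(self.memory,
                            key=lambda m: overlap(w, m[0]), reverse=True)
            votes = Counter(tag for _, tag in ranked[:self.k])
            out.append(votes.most_common(1)[0][0])
        return out

# Toy usage with hypothetical IOB chunk tags:
chunker = MemoryBasedChunker(k=1)
chunker.train(["the", "cat", "sat"], ["B-NP", "I-NP", "B-VP"])
chunker.train(["a", "dog", "ran"], ["B-NP", "I-NP", "B-VP"])
print(chunker.tag(["the", "dog", "sat"]))  # → ['B-NP', 'I-NP', 'B-VP']
```

Note that the input here is words only, with no POS features; the abstract's comparison amounts to swapping the word features in each window for gold-standard or automatically assigned POS tags, or concatenating both.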