We propose a novel self-training method for a parser which uses a lexicalised grammar and supertagger, focusing on increasing the speed of the parser rather than its accuracy. The idea is to train the supertagger on large amounts of parser output, so that the supertagger can learn to supply the supertags that the parser will eventually choose as part of the highest-scoring derivation. Since the supertagger supplies fewer supertags overall, the parsing speed is increased. We demonstrate the effectiveness of the method using a CCG supertagger and parser, obtaining significant speed increases on newspaper text with no loss in accuracy. We also show that the method can be used to adapt the CCG parser to new domains, obtaining accuracy and speed improvements for Wikipedia and biomedical text.
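The self-training idea can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not the system described above: `ToySupertagger` is a hypothetical unigram model, the `beta` cutoff is one simple way to decide how many supertags to supply per word, and `toy_parse` stands in for a real parser by returning the supertag it "chose" for each word.

```python
from collections import Counter, defaultdict


class ToySupertagger:
    """Hypothetical unigram supertagger: per-word tag counts with a beta cutoff."""

    def __init__(self, beta=0.1):
        self.beta = beta  # keep tags whose count is at least beta * the best count
        self.counts = defaultdict(Counter)

    def train(self, tagged_sentences):
        # Each sentence is a list of (word, supertag) pairs.
        for sentence in tagged_sentences:
            for word, tag in sentence:
                self.counts[word][tag] += 1

    def candidates(self, word):
        # Supply every supertag that is close enough to the most frequent one.
        tag_counts = self.counts.get(word)
        if not tag_counts:
            return set()
        best = max(tag_counts.values())
        return {t for t, c in tag_counts.items() if c >= self.beta * best}


def self_train(tagger, parse, unlabelled_sentences):
    # Retrain the supertagger on the supertags the parser ends up choosing.
    parser_output = [parse(sentence) for sentence in unlabelled_sentences]
    tagger.train(parser_output)
    return tagger


def toy_parse(sentence):
    # Stand-in for a real parser (assumption): always picks the verb reading of "saw".
    return [(w, "(S\\NP)/NP" if w == "saw" else "NP") for w in sentence]


# Seed data in which "saw" is ambiguous between a verb and a noun category.
tagger = ToySupertagger(beta=0.1)
tagger.train([[("saw", "(S\\NP)/NP")], [("saw", "N")]])
before = tagger.candidates("saw")

# Self-train on parser output for 20 unlabelled sentences.
self_train(tagger, toy_parse, [["we", "saw", "them"]] * 20)
after = tagger.candidates("saw")
```

In this toy setting the supertagger initially supplies both categories for "saw"; after self-training, the category the parser keeps choosing dominates the counts and the rarer one falls below the cutoff. With fewer supertags per word, a parser has fewer derivations to explore, which is the source of the speed-up described above.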