Empirical methods for artificial intelligence
Empirical methods for artificial intelligence
Artificial Intelligence - Special volume on empirical methods
A Maximum-Entropy-Inspired Parser
A Maximum-Entropy-Inspired Parser
Building a large annotated corpus of English: the penn treebank
Computational Linguistics - Special issue on using large corpora: II
Coarse-to-fine n-best parsing and MaxEnt discriminative reranking
ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
The importance of supertagging for wide-coverage CCG parsing
COLING '04 Proceedings of the 20th international conference on Computational Linguistics
Bidirectional inference with the easiest-first strategy for tagging sequence data
HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
The GENIA corpus: an annotated research abstract corpus in molecular biology domain
HLT '02 Proceedings of the second international conference on Human Language Technology Research
Parsing '05 Proceedings of the Ninth International Workshop on Parsing Technology
IJCNLP'04 Proceedings of the First international joint conference on Natural Language Processing
IJCNLP'05 Proceedings of the Second international joint conference on Natural Language Processing
Adapting a probabilistic disambiguation model of an HPSG parser to a new domain
IJCNLP'05 Proceedings of the Second international joint conference on Natural Language Processing
COLING '08 Proceedings of the 22nd International Conference on Computational Linguistics - Volume 1
Inductive probabilistic taxonomy learning using singular value decomposition
Natural Language Engineering
Machine learning for high-quality tokenization replicating variable tokenization schemes
CICLing'13 Proceedings of the 14th international conference on Computational Linguistics and Intelligent Text Processing - Volume Part I
Automatic extraction of function-behaviour-state information from patents
Advanced Engineering Informatics
Hi-index | 0.00 |
We aim to improve the performance of a syntactic parser that uses a part-of-speech (POS) tagger as a preprocessor. Pipelined parsers consisting of POS taggers and syntactic parsers have several advantages, such as the capability of domain adaptation. However the performance of such systems on raw texts tends to be disappointing as they are affected by the errors of automatic POS tagging. We attempt to compensate for the decrease in accuracy caused by automatic taggers by allowing the taggers to output multiple answers when the tags cannot be determined reliably enough. We empirically verify the effectiveness of the method using an HPSG parser trained on the Penn Treebank. Our results show that ambiguous POS tagging improves parsing if outputs of taggers are weighted by probability values, and the results support previous studies with similar intentions. We also examine the effectiveness of our method for adapting the parser to the GENIA corpus and show that the use of ambiguous POS taggers can help development of portable parsers while keeping accuracy high.