"cba to check the spelling" investigating parser performance on discussion forum posts

Authors:
Jennifer Foster
Affiliations:
Dublin City University
Venue:
HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Year:
2010

Citing 6
Cited 11

Intricacies of Collins' Parsing Model

Computational Linguistics
Coarse-to-fine n-best parsing and MaxEnt discriminative reranking

ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Reranking and self-training for parser adaptation

ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Learning accurate, compact, and interpretable tree annotation

ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
A phrase-based statistical model for SMS text normalization

COLING-ACL '06 Proceedings of the COLING/ACL on Main conference poster sessions
Domain adaptation with artificial data for semantic parsing of speech

NAACL-Short '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Short Papers

Uptraining for accurate deterministic question parsing

EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
Lessons learned in part-of-speech tagging of conversational speech

EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
Affect analysis model: Novel rule-based approach to affect sensing from text

Natural Language Engineering
Syntactic error detection and correction in date expressions using finite-state transducers

Natural Language Engineering
Disentangling chat with local coherence models

HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Experiments with artificially generated noise for cleansing noisy text

Proceedings of the 2011 Joint Workshop on Multilingual OCR and Analytics for Noisy Unstructured Text Data
Adapting a WSJ trained part-of-speech tagger to noisy text: preliminary results

Proceedings of the 2011 Joint Workshop on Multilingual OCR and Analytics for Noisy Unstructured Text Data
Training a parser for machine translation reordering

EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
A word clustering approach to domain adaptation: effective parsing of biomedical texts

IWPT '11 Proceedings of the 12th International Conference on Parsing Technologies
Comparing the use of edited and unedited text in parser self-training

IWPT '11 Proceedings of the 12th International Conference on Parsing Technologies
Machine learning for high-quality tokenization replicating variable tokenization schemes

CICLing'13 Proceedings of the 14th international conference on Computational Linguistics and Intelligent Text Processing - Volume Part I

Quantified Score

Hi-index	0.00

Visualization

Abstract

We evaluate the Berkeley parser on text from an online discussion forum. We evaluate the parser output with and without gold tokens and spellings (using Sparseval and Parseval), and we compile a list of problematic phenomena for this domain. The Parseval f-score for a small development set is 77.56. This increases to 80.27 when we apply a set of simple transformations to the input sentences and to the Wall Street Journal (WSJ) training sections.