Handling noisy training and testing data

Authors: Don Blaheta
Affiliations: Brown University
Venue: EMNLP '02 Proceedings of the ACL-02 conference on Empirical methods in natural language processing - Volume 10
Year: 2002

Abstract

In the field of empirical natural language processing, researchers constantly deal with large amounts of marked-up data; whether the markup is done by the researcher or someone else, human nature dictates that it will have errors in it. This paper will more fully characterise the problem and discuss whether, when, and how to correct the errors. The discussion is illustrated with specific examples involving function tagging in the Penn Treebank.