It is usually assumed that the noise present in annotated data is random classification noise. Yet there is evidence that disagreements between annotators are not always random attention slips; they may instead reflect systematically different biases toward the classification categories, at least for the harder-to-decide cases. Under an annotation generation model that takes this into account, some of the training instances may in fact be hard cases with unreliable annotations. We show that these are relatively unproblematic for an algorithm operating under the 0--1 loss model, whereas for the commonly used voted perceptron algorithm, hard training cases can lead to incorrect predictions on the uncontroversial cases at test time.
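For readers unfamiliar with the algorithm the abstract refers to, the following is a minimal sketch of the standard voted perceptron (in the style of Freund and Schapire's formulation), not the specific implementation analyzed in the paper. Function names and the toy data are illustrative; the key point is that every intermediate weight vector survives and votes at prediction time, which is why mislabeled hard training cases can continue to influence predictions on easy test cases.

```python
import numpy as np

def voted_perceptron_train(X, y, epochs=10):
    """Train a voted perceptron on labels y in {-1, +1}.

    Returns a list of (weight_vector, vote_count) pairs; each stored
    perceptron votes with weight equal to the number of consecutive
    training examples it classified correctly before its first mistake.
    """
    n, d = X.shape
    w = np.zeros(d)
    c = 1                                 # survival count of current weights
    machines = []
    for _ in range(epochs):
        for i in range(n):
            if y[i] * (w @ X[i]) <= 0:    # mistake: retire w with its count
                machines.append((w.copy(), c))
                w = w + y[i] * X[i]       # standard perceptron update
                c = 1
            else:
                c += 1
    machines.append((w.copy(), c))        # keep the final weight vector too
    return machines

def voted_perceptron_predict(machines, x):
    # Each stored perceptron casts a signed vote weighted by its survival count.
    s = sum(c * np.sign(w @ x) for w, c in machines)
    return 1 if s >= 0 else -1
```

Because even short-lived weight vectors produced by noisy updates retain a vote, annotation noise on hard cases is not simply averaged away, which is consistent with the failure mode the abstract describes.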