Rule writing or annotation: cost-efficient resource usage for base noun phrase chunking
ACL '00 Proceedings of the 38th Annual Meeting on Association for Computational Linguistics
Sample selection for statistical grammar induction
EMNLP '00 Proceedings of the 2000 Joint SIGDAT conference on Empirical methods in natural language processing and very large corpora: held in conjunction with the 38th Annual Meeting of the Association for Computational Linguistics - Volume 13
Active learning and logarithmic opinion pools for hpsg parse selection
Natural Language Engineering
Assessing the costs of sampling methods in active learning for annotation
HLT-Short '08 Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics on Human Language Technologies: Short Papers
An analysis of active learning strategies for sequence labeling tasks
EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
IGT-XML: an XML format for interlinearized glossed texts
LAW '07 Proceedings of the Linguistic Annotation Workshop
Investigating the effects of selective sampling on the annotation task
CONLL '05 Proceedings of the Ninth Conference on Computational Natural Language Learning
Natural language processing and linguistic fieldwork
Computational Linguistics
EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 1 - Volume 1
Unsupervised morphological segmentation and clustering with document boundaries
EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 2 - Volume 2
Modeling and encoding traditional wordlists for machine applications
NLPLING '10 Proceedings of the 2010 Workshop on NLP and Linguistics: Finding the Common Ground
Crouching Dirichlet, hidden Markov model: unsupervised POS tagging with context local tag generation
EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
Hi-index | 0.00 |
This paper presents pilot work integrating machine labeling and active learning with human annotation of data for the language documentation task of creating interlinearized gloss text (IGT) for the Mayan language Uspanteko. The practical goal is to produce a totally annotated corpus that is as accurate as possible given limited time for manual annotation. We describe ongoing pilot studies which examine the influence of three main factors on reducing the time spent to annotate IGT: suggestions from a machine labeler, sample selection methods, and annotator expertise.