Corpus-based grammar induction relies on many hand-parsed sentences as training examples. However, constructing a training corpus with a detailed syntactic analysis for every sentence is a labor-intensive task. We propose using sample selection methods to minimize the amount of annotation needed in the training data, thereby reducing the workload of the human annotators. This paper shows that the amount of annotated training data can be reduced by 36% without degrading the quality of the induced grammars.
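The abstract does not say which selection criterion is used, so the following is only a minimal sketch of the general idea of sample selection: rank unannotated sentences by how uncertain the current model is about them, and send only the most uncertain ones to the annotators. The entropy criterion, the function names `entropy` and `select_batch`, and the toy parse distributions are all assumptions made for illustration, not the paper's method or data.

```python
import math
from typing import Callable, List, Sequence


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in bits) of a distribution over candidate parses."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def select_batch(
    pool: List[str],
    parse_distribution: Callable[[str], Sequence[float]],
    batch_size: int,
) -> List[str]:
    """Return the batch_size sentences the model is least certain about.

    parse_distribution is a hypothetical hook: given a sentence, it returns
    the current model's probabilities for that sentence's candidate parses.
    """
    ranked = sorted(
        pool,
        key=lambda sentence: entropy(parse_distribution(sentence)),
        reverse=True,  # most uncertain first
    )
    return ranked[:batch_size]


# Toy, made-up parse distributions standing in for a real parser's output.
toy_distributions = {
    "the dog barks": [0.97, 0.03],
    "I saw the man with the telescope": [0.5, 0.5],
    "time flies like an arrow": [0.4, 0.3, 0.3],
}

chosen = select_batch(
    pool=list(toy_distributions),
    parse_distribution=toy_distributions.__getitem__,
    batch_size=2,
)
print(chosen)
```

In a full loop, the selected sentences would be hand-parsed, added to the training set, and the grammar re-induced before the next round of selection. Raw entropy tends to favor sentences with many candidate parses, so a real system might normalize it, for example by sentence length.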