Building a statistical parser requires a (large) annotated corpus, and acquiring such a corpus is costly and time-consuming. This paper presents a method that reduces this demand using active learning, which selects which samples to annotate instead of blindly annotating the whole training corpus.

Sample selection for annotation is based on "representativeness" and "usefulness". A model-based distance is proposed to measure the difference between two sentences and their most likely parse trees. Based on this distance, the active learning process analyzes the sample distribution by clustering and calculates the density of each sample to quantify its representativeness. Furthermore, a sentence is deemed useful if the existing model is highly uncertain about its parses, where uncertainty is measured by various entropy-based scores.

Experiments are carried out on the shallow semantic parser of an air travel dialog system. Our results show that, for about the same parsing accuracy, we need to annotate only a third of the samples compared to the usual random selection method.
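
The selection criterion described above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not give the distance, the density formula, or how the two criteria are combined, so the sketch assumes a precomputed pairwise distance matrix, a density defined as the mean of exp(-distance) to all other samples, and a score that is simply the product of density and parse entropy. The function names `parse_entropy`, `density`, and `select_samples` are hypothetical.

```python
import math


def parse_entropy(probs):
    """Shannon entropy (in nats) of a distribution over candidate parses.

    The scores are normalized first, so unnormalized parse scores are accepted.
    """
    total = sum(probs)
    return -sum((p / total) * math.log(p / total) for p in probs if p > 0)


def density(i, dist):
    """Assumed density: mean similarity exp(-d) of sample i to every other sample."""
    sims = [math.exp(-d) for j, d in enumerate(dist[i]) if j != i]
    return sum(sims) / len(sims)


def select_samples(dist, parse_probs, k):
    """Return the indices of the k samples with the highest density * entropy score.

    The product is an assumed way to combine representativeness and usefulness.
    """
    scores = [
        density(i, dist) * parse_entropy(parse_probs[i])
        for i in range(len(dist))
    ]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


# Toy data: sentences 0 and 1 are close to each other, sentence 2 is an outlier.
dist = [
    [0.0, 0.1, 5.0],
    [0.1, 0.0, 5.0],
    [5.0, 5.0, 0.0],
]
# Sentences 0 and 2 have two equally likely parses; sentence 1 is nearly unambiguous.
parse_probs = [[0.5, 0.5], [0.99, 0.01], [0.5, 0.5]]

print(select_samples(dist, parse_probs, 1))
```

In this toy data, sentence 0 is both in a dense region and ambiguous for the model, so it is ranked first. Sentence 2 is equally ambiguous but isolated, so its low density pushes it down the ranking.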