This paper proposes a new error-driven HMM-based text chunk tagger with a context-dependent lexicon. Compared with a standard HMM-based tagger, this tagger uses a new Hidden Markov Modelling approach that incorporates more contextual information into each lexical entry. Moreover, an error-driven learning approach is adopted to reduce the memory requirement by keeping only positive lexical entries, which makes it possible to incorporate still more context-dependent lexical entries. Experiments show that this technique achieves overall precision and recall rates of 93.40% and 93.95% for all chunk types, 93.60% and 94.64% for noun phrases, and 94.64% and 94.75% for verb phrases when trained on Penn WSJ TreeBank sections 00-19 and tested on sections 20-24, while 25-fold validation experiments on the Penn WSJ TreeBank show overall precision and recall rates of 96.40% and 96.47% for all chunk types, 96.49% and 96.99% for noun phrases, and 97.13% and 97.36% for verb phrases.
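To make the underlying idea concrete, the following is a minimal sketch of HMM-based chunk tagging with BIO tags, decoded by the Viterbi algorithm. It is not the paper's system: the transition and emission probabilities below are invented toy values (not trained on the Penn Treebank), and the plain word-emission model stands in for the paper's richer context-dependent lexical entries.

```python
import math

# BIO chunk tags used as HMM states (toy noun-phrase-only tag set).
STATES = ["B-NP", "I-NP", "O"]

# P(tag_i | tag_{i-1}), with "<s>" as the sentence-start symbol.
# All values are illustrative, hand-set toy probabilities.
TRANS = {
    "<s>":  {"B-NP": 0.6, "I-NP": 0.0, "O": 0.4},
    "B-NP": {"B-NP": 0.1, "I-NP": 0.6, "O": 0.3},
    "I-NP": {"B-NP": 0.1, "I-NP": 0.5, "O": 0.4},
    "O":    {"B-NP": 0.5, "I-NP": 0.0, "O": 0.5},
}

# P(word | tag); unseen (tag, word) pairs fall back to a small floor.
EMIT = {
    "B-NP": {"the": 0.4, "dog": 0.2, "cat": 0.2},
    "I-NP": {"dog": 0.4, "cat": 0.4},
    "O":    {"barks": 0.5, "sleeps": 0.3},
}
FLOOR = 1e-6


def viterbi(words):
    """Return the most probable BIO tag sequence for `words`."""
    # best[t] = (log prob, tag path) over all paths ending in state t.
    best = {
        t: (math.log(TRANS["<s>"][t] + FLOOR)
            + math.log(EMIT[t].get(words[0], FLOOR)), [t])
        for t in STATES
    }
    for w in words[1:]:
        nxt = {}
        for t in STATES:
            emit = math.log(EMIT[t].get(w, FLOOR))
            # Extend the best predecessor path into state t.
            score, prev = max(
                (best[p][0] + math.log(TRANS[p][t] + FLOOR) + emit, p)
                for p in STATES
            )
            nxt[t] = (score, best[prev][1] + [t])
        best = nxt
    return max(best.values(), key=lambda x: x[0])[1]


print(viterbi(["the", "dog", "barks"]))  # → ['B-NP', 'I-NP', 'O']
```

In the paper's approach, the emission model would be conditioned on context-dependent lexical entries rather than single words, with error-driven learning pruning the lexicon to positive entries only; the sketch above keeps only the shared Viterbi-decoding skeleton.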