Tagging and chunking with bigrams

Authors:
Ferran Pla;Antonio Molina;Natividad Prieto
Affiliations:
Universitat Politècnica de València;Universitat Politècnica de València;Universitat Politècnica de València
Venue:
COLING '00 Proceedings of the 18th conference on Computational linguistics - Volume 2
Year:
2000

References: 17
Cited by: 7

Transformation-based error-driven learning and natural language processing: a case study in part-of-speech tagging

Computational Linguistics
Part-of-Speech Tagging Using Decision Trees

ECML '98 Proceedings of the 10th European Conference on Machine Learning
Learning grammatical structure using statistical decision-trees

ICGI '96 Proceedings of the 3rd International Colloquium on Grammatical Inference: Learning Syntax from Sentences
Building a large annotated corpus of English: the Penn Treebank

Computational Linguistics - Special issue on using large corpora: II
Coping with ambiguity and unknown words through probabilistic models

Computational Linguistics - Special issue on using large corpora: II
Tagging English text with a probabilistic model

Computational Linguistics
A stochastic parts program and noun phrase parser for unrestricted text

ANLC '88 Proceedings of the second conference on Applied natural language processing
Finding clauses in unrestricted text by finitary and stochastic methods

ANLC '88 Proceedings of the second conference on Applied natural language processing
A practical part-of-speech tagger

ANLC '92 Proceedings of the third conference on Applied natural language processing
Incremental finite-state parsing

ANLC '97 Proceedings of the fifth conference on Applied natural language processing
Developing a hybrid NP parser

ANLC '97 Proceedings of the fifth conference on Applied natural language processing
A syntax-based part-of-speech analyser

EACL '95 Proceedings of the seventh conference on European chapter of the Association for Computational Linguistics
A memory-based approach to learning shallow natural language patterns

COLING '98 Proceedings of the 17th international conference on Computational linguistics - Volume 1
Classifier combination for improved lexical disambiguation

COLING '98 Proceedings of the 17th international conference on Computational linguistics - Volume 1
Error-driven pruning of Treebank grammars for base noun phrase identification

COLING '98 Proceedings of the 17th international conference on Computational linguistics - Volume 1
Improving data driven wordclass tagging by system combination

COLING '98 Proceedings of the 17th international conference on Computational linguistics - Volume 1
Surface grammatical analysis for the extraction of terminological noun phrases

COLING '92 Proceedings of the 14th conference on Computational linguistics - Volume 3

Language Understanding Using Two-Level Stochastic Models with POS and Semantic Units

TSD '01 Proceedings of the 4th International Conference on Text, Speech and Dialogue
Shallow parsing using specialized HMMs

The Journal of Machine Learning Research
Improving part-of-speech tagging using lexicalized HMMs

Natural Language Engineering
Improving chunking by means of lexical-contextual information in statistical language models

CoNLL '00 Proceedings of the 2nd workshop on Learning language in logic and the 4th conference on Computational natural language learning - Volume 7
Natural language tagging with genetic algorithms

Information Processing Letters
Highly accurate error-driven method for noun phrase detection

Pattern Recognition Letters
Statistical recognition of noun phrases in unrestricted text

IDA'05 Proceedings of the 6th international conference on Advances in Intelligent Data Analysis

Abstract

In this paper we present an integrated system for tagging and chunking texts in a given language. The approach is based on stochastic finite-state models that are learnt automatically, such as bigram models or finite-state automata learnt using grammatical inference techniques. Because the models involved are learnt automatically, the system is flexible and portable. To show the viability of our approach, we present results for tagging and chunking using bigram models on the Wall Street Journal corpus. We achieved a tagging accuracy of 96.8%, and for NP chunks a precision of 94.6% with a recall of 93.6%.
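
To make the idea of tagging with a bigram model concrete, here is a minimal sketch of a first-order (bigram) HMM tagger with Viterbi decoding. It is not the authors' system: the toy corpus `TRAIN`, the function names, and the add-one smoothing are all assumptions chosen for illustration, and the sketch covers tagging only, not the integrated chunking step described in the abstract.

```python
import math
from collections import defaultdict

# Hypothetical toy corpus of (word, tag) pairs; the paper uses the
# Wall Street Journal corpus instead.
TRAIN = [
    [("the", "DT"), ("dog", "NN"), ("barks", "VBZ")],
    [("the", "DT"), ("cat", "NN"), ("sleeps", "VBZ")],
    [("a", "DT"), ("dog", "NN"), ("sleeps", "VBZ")],
]
START = "<s>"  # pseudo-tag marking the start of a sentence


def train_bigram(sentences):
    """Count tag-bigram transitions and tag-to-word emissions."""
    trans = defaultdict(lambda: defaultdict(int))
    emit = defaultdict(lambda: defaultdict(int))
    for sent in sentences:
        prev = START
        for word, tag in sent:
            trans[prev][tag] += 1
            emit[tag][word] += 1
            prev = tag
    return trans, emit


def log_prob(counts, context, event, n_events):
    """Add-one smoothed log P(event | context) (smoothing is an assumption)."""
    row = counts.get(context, {})
    return math.log((row.get(event, 0) + 1) / (sum(row.values()) + n_events))


def viterbi(words, trans, emit):
    """Return the most probable tag sequence under the bigram model."""
    if not words:
        return []
    tags = sorted(emit)
    n_tags = len(tags)
    # Vocabulary size plus one slot reserved for unknown words.
    n_words = len({w for t in emit for w in emit[t]}) + 1
    # best[i][tag] = (log score of best path ending in tag, previous tag)
    best = [{t: (log_prob(trans, START, t, n_tags)
                 + log_prob(emit, t, words[0], n_words), None) for t in tags}]
    for i in range(1, len(words)):
        column = {}
        for t in tags:
            e = log_prob(emit, t, words[i], n_words)
            column[t] = max(
                (best[i - 1][p][0] + log_prob(trans, p, t, n_tags) + e, p)
                for p in tags
            )
        best.append(column)
    # Follow back-pointers from the best final tag.
    tag = max(tags, key=lambda t: best[-1][t][0])
    path = [tag]
    for i in range(len(words) - 1, 0, -1):
        tag = best[i][tag][1]
        path.append(tag)
    return path[::-1]


trans, emit = train_bigram(TRAIN)
print(viterbi(["the", "dog", "sleeps"], trans, emit))  # prints ['DT', 'NN', 'VBZ']
```

Working in log space avoids numeric underflow on long sentences. A realistic system would replace add-one smoothing with a better estimator and handle unknown words more carefully.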