UCSG shallow parser

Authors:
Guntur Bharadwaja Kumar;Kavi Narayana Murthy
Affiliations:
Department of Computer and Information Sciences, University of Hyderabad, India;Department of Computer and Information Sciences, University of Hyderabad, India
Venue:
CICLing'06 Proceedings of the 7th international conference on Computational Linguistics and Intelligent Text Processing
Year:
2006

Abstract

Recently, there has been increasing interest in integrating rule-based methods with statistical techniques to develop robust, wide-coverage, high-performance parsing systems. In this paper, we describe the UCSG shallow parser architecture, which combines three components to determine the best shallow parse for a given sentence: linguistic constraints expressed as finite state grammars, statistical rating using HMMs built from a POS-tagged corpus, and an A* search for global optimization. The primary design aim is a judicious combination of linguistic and statistical methods that yields wide-coverage, robust shallow parsing systems without the need for large-scale manually parsed training corpora. The UCSG architecture uses a grammar to specify all valid structures and a statistical component to rate and rank the possible alternatives, so as to produce the best parse first without compromising the ability to produce all possible parses. The architecture supports bootstrapping, with the aim of reducing the need for parsed training corpora. The complete system has been implemented in Perl under Linux. We first describe the UCSG shallow parsing architecture and then focus on evaluating the UCSG finite state grammar on the chunking task for English. Recall of 91.16% and 93.73% has been obtained on the Susanne parsed corpus and the CoNLL 2000 chunking task test data set, respectively. Extensive experimentation is under way to evaluate the other modules.
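The three-stage pipeline described in the abstract (a grammar that proposes every valid chunk, a rating step, and a best-first search that yields the best parse first while still being able to enumerate the rest) can be illustrated with a minimal sketch. It is written in Python for brevity, not in the Perl of the actual system, and everything in it is an assumption made for illustration: the regular expressions in `GRAMMAR` are a toy stand-in for the UCSG finite state grammar, `toy_cost` is a fixed placeholder for the HMM-based rating, and the names `candidate_chunks` and `ranked_parses` are hypothetical.

```python
import heapq
import re

# ASSUMPTION: toy chunk grammar, one regular expression per chunk type over
# space-separated POS tags. This is not the UCSG grammar.
GRAMMAR = {
    "NP": re.compile(r"(DT )?(JJ )*(NN|NNS)( (NN|NNS))*"),
    "VP": re.compile(r"(MD )?(VB|VBD|VBZ)"),
    "PP": re.compile(r"IN"),
}


def candidate_chunks(tags):
    """Return every (start, end, label) span whose tag sequence the grammar accepts."""
    spans = []
    n = len(tags)
    for i in range(n):
        for j in range(i + 1, n + 1):
            seq = " ".join(tags[i:j])
            for label, pattern in GRAMMAR.items():
                if pattern.fullmatch(seq):
                    spans.append((i, j, label))
    return spans


def toy_cost(chunk):
    """ASSUMPTION: placeholder rating. A real system would derive this from HMM scores."""
    _start, _end, label = chunk
    return 2.0 if label == "O" else 1.0  # "O" marks a word left outside any chunk


def ranked_parses(tags, cost, min_cost=1.0):
    """Yield (chunk_sequence, total_cost) in order of nondecreasing cost.

    A* over sentence positions: g is the cost so far and h is an admissible
    estimate of the remaining cost (at least one more chunk if words remain).
    min_cost must be a lower bound on the cost of any single chunk.
    """
    n = len(tags)
    edges = {i: [(i + 1, "O")] for i in range(n)}  # fallback keeps the search total
    for i, j, label in candidate_chunks(tags):
        edges[i].append((j, label))

    def h(pos):
        return min_cost if pos < n else 0.0

    heap = [(h(0), 0.0, 0, ())]
    while heap:
        _f, g, pos, path = heapq.heappop(heap)
        if pos == n:
            yield list(path), g
            continue
        for end, label in edges[pos]:
            chunk = (pos, end, label)
            g2 = g + cost(chunk)
            heapq.heappush(heap, (g2 + h(end), g2, end, path + (chunk,)))


# Tags for a sentence such as "The quick fox jumped over the fence".
example_tags = ["DT", "JJ", "NN", "VBD", "IN", "DT", "NN"]
best_chunks, best_cost = next(ranked_parses(example_tags, toy_cost))
print(best_chunks, best_cost)
```

Because no partial path is pruned, the generator can be consumed further to obtain the second-best and later parses, which mirrors the stated goal of producing the best parse first without giving up the remaining alternatives. The price is worst-case exponential enumeration, so a practical implementation would likely bound the number of parses requested.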