Application of finite-state transducers to the acquisition of verb subcategorization information

Authors:
I. Aldezabal;M. Aranzabe;K. Gojenola;M. Oronoz;K. Sarasola;A. Atutxa
Affiliations:
IXA group, Department of Computer Languages and Systems, University of the Basque Country, 649 P.K., 20080-Donostia, Spain e-mail: jibalroi@si.ehu.es;IXA group, Department of Computer Languages and Systems, University of the Basque Country, 649 P.K., 20080-Donostia, Spain e-mail: jibarurm@si.ehu.es;IXA group, Department of Computer Languages and Systems, University of the Basque Country, 649 P.K., 20080-Donostia, Spain e-mail: jipgogak@si.ehu.es;IXA group, Department of Computer Languages and Systems, University of the Basque Country, 649 P.K., 20080-Donostia, Spain e-mail: jiboranm@si.ehu.es;IXA group, Department of Computer Languages and Systems, University of the Basque Country, 649 P.K., 20080-Donostia, Spain e-mail: jipsagak@si.ehu.es;Department of Linguistics, University of Maryland, College Park, MD 20742, USA e-mail: sener@wam.umd.edu
Venue:
Natural Language Engineering
Year:
2003

Abstract

This paper presents the design and implementation of a finite-state syntactic grammar of Basque, used to extract verb subcategorization instances from newspaper texts. After a partial parser has built basic syntactic units such as noun phrases, prepositional phrases, and sentential complements, a finite-state parser performs syntactic disambiguation, determines clause boundaries, and filters the results, yielding each verb occurrence together with its associated syntactic components, whether complements or adjuncts. The set of occurrences for each verb is then filtered by statistical measures that distinguish arguments from adjuncts.
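
The final statistical filtering step can be illustrated with a minimal sketch. The abstract does not name the measures used, so this example assumes pointwise mutual information (PMI) between a verb and the case of a co-occurring phrase as a stand-in association score. The function names, the toy verb/case pairs, and the threshold are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def pmi_scores(occurrences):
    """Return PMI for each (verb, case) pair.

    `occurrences` is a list of (verb, case) tuples, one per phrase observed
    with a verb. This input shape is an assumption made for illustration.
    """
    pair_counts = Counter(occurrences)
    verb_counts = Counter(verb for verb, _ in occurrences)
    case_counts = Counter(case for _, case in occurrences)
    total = len(occurrences)

    scores = {}
    for (verb, case), n in pair_counts.items():
        p_pair = n / total
        p_verb = verb_counts[verb] / total
        p_case = case_counts[case] / total
        # PMI > 0: the case appears with this verb more often than chance.
        scores[(verb, case)] = math.log2(p_pair / (p_verb * p_case))
    return scores


def likely_arguments(occurrences, threshold=0.0):
    """Keep pairs whose PMI exceeds an (arbitrary, illustrative) threshold."""
    return {pair for pair, score in pmi_scores(occurrences).items()
            if score > threshold}


# Toy data only: "eman" (give) with dative phrases, "etorri" (come) with
# ablative phrases, and inessive phrases spread evenly over both verbs.
toy = (
    [("eman", "dative")] * 4
    + [("eman", "inessive")] * 2
    + [("etorri", "ablative")] * 4
    + [("etorri", "inessive")] * 2
)

if __name__ == "__main__":
    for pair, score in sorted(pmi_scores(toy).items()):
        print(pair, round(score, 3))
    print(likely_arguments(toy, threshold=0.5))
```

In this toy setting, a case that is strongly tied to one verb scores high and is treated as a candidate argument, while a case that is spread evenly across verbs scores near zero and is treated as a likely adjunct. A real system would need far more data and a principled choice of measure and threshold.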