Application of finite-state transducers to the acquisition of verb subcategorization information

  • Authors:
  • I. Aldezabal; M. Aranzabe; K. Gojenola; M. Oronoz; K. Sarasola; A. Atutxa

  • Affiliations:
  • IXA group, Department of Computer Languages and Systems, University of the Basque Country, 649 P.K., 20080-Donostia, Spain, e-mail: jibalroi@si.ehu.es
  • IXA group, Department of Computer Languages and Systems, University of the Basque Country, 649 P.K., 20080-Donostia, Spain, e-mail: jibarurm@si.ehu.es
  • IXA group, Department of Computer Languages and Systems, University of the Basque Country, 649 P.K., 20080-Donostia, Spain, e-mail: jipgogak@si.ehu.es
  • IXA group, Department of Computer Languages and Systems, University of the Basque Country, 649 P.K., 20080-Donostia, Spain, e-mail: jiboranm@si.ehu.es
  • IXA group, Department of Computer Languages and Systems, University of the Basque Country, 649 P.K., 20080-Donostia, Spain, e-mail: jipsagak@si.ehu.es
  • Department of Linguistics, University of Maryland, College Park, MD 20742, USA, e-mail: sener@wam.umd.edu

  • Venue:
  • Natural Language Engineering
  • Year:
  • 2003

Abstract

This paper presents the design and implementation of a finite-state syntactic grammar of Basque, used to extract verb subcategorization instances from newspaper texts. A partial parser first builds basic syntactic units such as noun phrases, prepositional phrases, and sentential complements. A finite-state parser then performs syntactic disambiguation, determines clause boundaries, and filters the results, yielding each verb occurrence together with its associated syntactic components, whether complements or adjuncts. Finally, the set of occurrences for each verb is filtered by statistical measures that distinguish arguments from adjuncts.
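
The abstract does not say which statistical measures are used in the final filtering step. As a rough illustration only, the sketch below assumes pointwise mutual information (PMI) between a verb and the case of each accompanying phrase, and keeps the cases whose PMI exceeds a threshold. The function name `argument_candidates`, the threshold, and the toy occurrences are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def argument_candidates(occurrences, threshold=0.0):
    """Return {verb: set of cases} whose PMI with the verb exceeds threshold.

    occurrences: list of (verb, [case, ...]) pairs, one per verb occurrence.
    PMI is an assumed measure here, not necessarily the one used in the paper.
    """
    verb_counts = Counter()
    case_counts = Counter()
    pair_counts = Counter()
    total = len(occurrences)

    for verb, cases in occurrences:
        verb_counts[verb] += 1
        # Count each case at most once per occurrence.
        for case in set(cases):
            case_counts[case] += 1
            pair_counts[(verb, case)] += 1

    result = {verb: set() for verb in verb_counts}
    for (verb, case), joint in pair_counts.items():
        # PMI = log2( P(verb, case) / (P(verb) * P(case)) )
        pmi = math.log2(joint * total / (verb_counts[verb] * case_counts[case]))
        if pmi > threshold:
            result[verb].add(case)
    return result


# Invented toy data: each entry is one verb occurrence with the cases
# of the phrases found in its clause.
toy_occurrences = [
    ("eman", ["ergative", "absolutive", "dative"]),
    ("eman", ["ergative", "absolutive", "dative", "inessive"]),
    ("eman", ["ergative", "absolutive", "dative"]),
    ("eman", ["ergative", "dative", "inessive"]),
    ("etorri", ["absolutive", "inessive"]),
    ("etorri", ["absolutive"]),
    ("etorri", ["absolutive", "inessive"]),
    ("etorri", ["absolutive"]),
]

candidates = argument_candidates(toy_occurrences)
```

On this toy data the inessive phrases, which occur equally often with both verbs, get a PMI of zero and are dropped as adjunct-like. The absolutive is also dropped for "eman" because it is frequent with every verb, which shows why a single association score of this kind is only a crude stand-in for the measures the paper refers to.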