Valency lexicon of Czech verbs VALLEX: recent experiments with frame disambiguation

  • Authors:
  • Markéta Lopatková;Ondřej Bojar;Jiří Semecký;Václava Benešová;Zdeněk Žabokrtský

  • Affiliations:
  • Institute of Formal and Applied Linguistics, Faculty of Mathematics and Physics, Charles University, Prague (all authors)

  • Venue:
  • TSD'05 Proceedings of the 8th international conference on Text, Speech and Dialogue
  • Year:
  • 2005

Abstract

VALLEX is a linguistically annotated lexicon that aims to describe syntactic information expected to be useful for NLP. As of summer 2005, the lexicon contains roughly 2500 manually annotated Czech verbs with over 6000 valency frames. In this paper we introduce VALLEX and describe an experiment in which VALLEX frames were assigned to 10,000 corpus instances of 100 Czech verbs; the pairwise inter-annotator agreement reaches 75%. The part of the data on which three human annotators agreed was used for an automatic word sense disambiguation task, in which we achieved a precision of 78.5%.
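The two figures in the abstract rest on simple counting, which can be illustrated with a minimal sketch. The abstract does not spell out how agreement was computed, so this assumes that pairwise agreement means the fraction of annotator pairs that chose the same frame for an instance, pooled over all instances. The function names, the frame identifiers, and the toy data are hypothetical and are not taken from the paper.

```python
from itertools import combinations

def pairwise_agreement(annotations):
    """Fraction of annotator pairs that assigned the same frame.

    `annotations` holds one tuple per corpus instance, with one frame
    label per annotator (an assumed data layout, not the paper's).
    """
    agree = 0
    total = 0
    for labels in annotations:
        for a, b in combinations(labels, 2):
            total += 1
            agree += (a == b)
    return agree / total if total else 0.0

def unanimous(annotations):
    """Keep the label of every instance on which all annotators agree."""
    return [labels[0] for labels in annotations if len(set(labels)) == 1]

# Hypothetical toy data: three annotators, four instances.
toy = [
    ("f1", "f1", "f1"),
    ("f1", "f2", "f1"),
    ("f3", "f3", "f3"),
    ("f2", "f1", "f3"),
]

print(pairwise_agreement(toy))  # 7 agreeing pairs out of 12
print(unanimous(toy))           # instances usable as gold data
```

Under this reading, only the instances returned by `unanimous` would serve as gold data for the disambiguation task, and precision would be the share of those instances for which the automatic system picks the same frame.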