Three Approaches to Word Sense Disambiguation for Czech

Authors:
Robert Král
Venue:
TSD '01 Proceedings of the 4th International Conference on Text, Speech and Dialogue
Year:
2001



Abstract

Before building a full word sense disambiguation (WSD) system, it is necessary to have a balanced and representative corpus annotated with sense tags. This requirement is certainly not fulfilled for the Czech language. We therefore decided to develop specific methods for annotating texts, starting with the most common nouns. Our approach uses a disambiguation algorithm based on sets of words, called bags. Its advantage is that the bags can be filled in various ways. Our ultimate goal is to reduce manual work as much as possible. Here we present three basic ways of filling bags: the first is based on the machine-readable version of SSJČ, the second takes advantage of learning from manually annotated text, and the third is a pseudoclustering strategy.
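
The abstract does not say how bags are scored against a context, so the following is only a minimal sketch of the general idea, assuming a simple overlap count between the context words and each sense's bag. The function names, the sense labels, and the toy bags for the Czech noun "koruna" are all hypothetical illustrations, not data or code from the paper. The second function loosely illustrates the second filling strategy (learning from manually annotated text) under the same assumptions; no lemmatization is performed.

```python
from collections import Counter

# Hypothetical toy bags for the noun "koruna" (illustrative only, not from the paper).
SENSE_BAGS = {
    "currency": {"banka", "kurz", "cena", "platit"},
    "tree": {"strom", "větve", "listí"},
    "royal": {"král", "trůn", "zlatá"},
}


def disambiguate(context_words, sense_bags, default=None):
    """Return the sense whose bag overlaps most with the context.

    Assumption: the score is the number of context tokens found in the bag.
    Returns `default` when no bag shares any word with the context.
    On a tie, the sense listed first in `sense_bags` is kept.
    """
    context = Counter(word.lower() for word in context_words)
    best_sense, best_score = default, 0
    for sense, bag in sense_bags.items():
        score = sum(count for word, count in context.items() if word in bag)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


def fill_bags_from_annotated(examples):
    """Build bags from (sense, context_words) pairs of sense-tagged text.

    Assumption: every context word of a tagged occurrence goes into that
    sense's bag, with no filtering or weighting.
    """
    bags = {}
    for sense, context_words in examples:
        bags.setdefault(sense, set()).update(w.lower() for w in context_words)
    return bags


if __name__ == "__main__":
    print(disambiguate(["kurz", "koruny", "banka"], SENSE_BAGS))
    learned = fill_bags_from_annotated([
        ("tree", ["vítr", "strom"]),
        ("royal", ["král", "nosil"]),
    ])
    print(disambiguate(["starý", "strom"], learned))
```

Keeping the scoring function separate from the bag contents mirrors the advantage named in the abstract: the same disambiguation step can be reused whether bags come from a dictionary, from annotated text, or from clustering.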