Finding Semantically Related Words in Large Corpora

Authors:
Pavel Smrž; Pavel Rychlý
Venue:
TSD '01 Proceedings of the 4th International Conference on Text, Speech and Dialogue
Year:
2001

Abstract

The paper addresses the linguistic problem of fully automatic grouping of semantically related words. We discuss measures of semantic relatedness between basic word forms and describe the treatment of collocations. Next, we present a procedure for hierarchical clustering of a very large number of semantically related words and give examples of the resulting partitioning of the data in the form of a dendrogram. Finally, we show a form of output presentation that facilitates inspection of the resulting word clusters.
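
As an illustration only, the general idea of clustering words hierarchically by relatedness can be sketched in a few lines of Python. This is not the procedure from the paper: the abstract does not specify the relatedness measure or the linkage criterion, so the sketch assumes cosine similarity over windowed co-occurrence counts and average-link agglomeration, and it would not scale to a large corpus. The toy corpus and the names `context_vectors`, `cosine`, `average_link`, and `cluster` are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def context_vectors(sentences, window=2):
    """For each word, count the words seen within `window` positions of it."""
    vectors = {}
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            ctx = vectors.setdefault(word, Counter())
            for j in range(lo, hi):
                if j != i:
                    ctx[tokens[j]] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity of two sparse count vectors (assumed measure)."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def average_link(c1, c2, vectors):
    """Mean pairwise similarity between the members of two clusters."""
    sims = [cosine(vectors[x], vectors[y]) for x in c1 for y in c2]
    return sum(sims) / len(sims)


def cluster(words, vectors):
    """Naive agglomerative clustering; returns the merge history,
    which is a flattened form of the dendrogram."""
    clusters = [(w,) for w in words]
    merges = []
    while len(clusters) > 1:
        # Pick the most similar pair of clusters and merge it.
        a, b = max(combinations(clusters, 2),
                   key=lambda p: average_link(p[0], p[1], vectors))
        merges.append((a, b, average_link(a, b, vectors)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


# Toy corpus (hypothetical data, for illustration only).
sentences = [s.split() for s in [
    "the cat drinks milk", "the dog drinks water",
    "the cat eats fish", "the dog eats meat",
    "a car needs fuel", "a truck needs fuel",
]]
vectors = context_vectors(sentences)
merges = cluster(["cat", "dog", "car", "truck"], vectors)
for left, right, sim in merges:
    print(left, "+", right, f"similarity={sim:.2f}")
```

Each entry in `merges` records one join of the dendrogram, from the most similar pair up to the root. In this toy data, "car" and "truck" share identical contexts and are joined first, followed by "cat" and "dog".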