Term Clustering Using a Corpus-Based Similarity Measure

Authors:
Goran Nenadic;Irena Spasic;Sophia Ananiadou
Venue:
TSD '02 Proceedings of the 5th International Conference on Text, Speech and Dialogue
Year:
2002

Abstract

In this paper we present a method for automatic term clustering. The method uses a hybrid similarity measure to cluster terms that are automatically extracted from a corpus with the C/NC-value method. The measure combines contextual, functional and lexical similarity, and it is used to fill the cells of a similarity matrix. The clustering algorithm uses either the nearest neighbour method or Ward's method to calculate the distance between clusters. The approach has been tested and evaluated in the domain of molecular biology, and the results are presented.
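
The pipeline described in the abstract (combine three similarity scores, fill a similarity matrix, then cluster agglomeratively) can be illustrated with a minimal Python sketch. The abstract does not give the actual formulas, so everything specific here is an assumption: the weighted-sum form of `hybrid_similarity`, the equal default weights, the conversion of similarity to distance as `1 - similarity`, and the toy terms and scores, which are invented for illustration and are not results from the paper. Only the nearest neighbour (single-linkage) variant is shown; Ward's method is omitted.

```python
from itertools import combinations


def hybrid_similarity(contextual, functional, lexical, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Combine three scores in [0, 1] as a weighted sum (assumed form, not from the paper)."""
    w_ctx, w_fun, w_lex = weights
    return w_ctx * contextual + w_fun * functional + w_lex * lexical


def similarity_matrix(terms, components, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Fill a symmetric matrix with hybrid similarities.

    `components` maps frozenset({term_a, term_b}) to a tuple
    (contextual, functional, lexical); missing pairs default to zero.
    """
    n = len(terms)
    matrix = [[1.0] * n for _ in range(n)]  # each term is fully similar to itself
    for i, j in combinations(range(n), 2):
        scores = components.get(frozenset((terms[i], terms[j])), (0.0, 0.0, 0.0))
        matrix[i][j] = matrix[j][i] = hybrid_similarity(*scores, weights=weights)
    return matrix


def nearest_neighbour_clustering(matrix, num_clusters):
    """Agglomerative clustering with nearest neighbour (single) linkage.

    The distance between two clusters is the smallest pairwise distance
    between their members, with distance assumed to be 1 - similarity.
    """
    clusters = [[i] for i in range(len(matrix))]
    while len(clusters) > num_clusters:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            dist = min(1.0 - matrix[i][j] for i in clusters[a] for j in clusters[b])
            if best is None or dist < best[0]:
                best = (dist, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # merge the closest pair
        del clusters[b]
    return clusters


# Toy data: terms and scores are made up for illustration only.
terms = ["retinoic acid receptor", "nuclear receptor", "thyroid hormone", "steroid hormone"]
components = {
    frozenset((terms[0], terms[1])): (0.8, 0.7, 0.6),
    frozenset((terms[2], terms[3])): (0.9, 0.9, 0.6),
    frozenset((terms[0], terms[2])): (0.2, 0.1, 0.0),
}
matrix = similarity_matrix(terms, components)
clusters = nearest_neighbour_clustering(matrix, num_clusters=2)
print([[terms[i] for i in cluster] for cluster in clusters])
```

With these invented scores the two receptor terms end up in one cluster and the two hormone terms in the other. Stopping at a fixed number of clusters is also a simplification; a similarity threshold would be an equally plausible stopping rule.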