Phrase clustering without document context

  • Authors:
  • Eric SanJuan;Fidelia Ibekwe-SanJuan

  • Affiliations:
  • URI, INIST-CNRS, LITA, University of Metz, France;URSIDOC, University of Lyon 3, France

  • Venue:
  • ECIR'06 Proceedings of the 28th European conference on Advances in Information Retrieval
  • Year:
  • 2006

Quantified Score

Hi-index 0.00

Visualization

Abstract

We applied different clustering algorithms to the task of clustering multi-word terms in order to reflect a humanly built ontology. Clustering was done without the usual document co-occurrence information. Our clustering algorithm, CPCL (Classification by Preferential Clustered Link) is based on general lexico-syntactic relations which do not require prior domain knowledge or the existence of a training set. Results show that CPCL performs well in terms of cluster homogeneity and shows good adaptability for handling large and sparse matrices.