A partially supervised metric multidimensional scaling algorithm for textual data visualization

  • Authors:
  • Ángela Blanco; Manuel Martín-Merino

  • Affiliations:
  • Universidad Pontificia de Salamanca, Salamanca, Spain; Universidad Pontificia de Salamanca, Salamanca, Spain

  • Venue:
  • IDA'07 Proceedings of the 7th international conference on Intelligent data analysis
  • Year:
  • 2007

Abstract

Multidimensional scaling (MDS) algorithms allow us to visualize the relationships among high-dimensional objects in an intuitive way. An interesting application of MDS is the visualization of semantic relations among documents or terms in textual databases. However, the MDS algorithms proposed in the literature exhibit low discriminant power: their unsupervised nature and the 'curse of dimensionality' favor overlap among different topics in the map. This problem can be overcome by exploiting the fact that many textual collections provide a categorization for a small subset of documents. In this paper we define new semi-supervised dissimilarity measures that better reflect the semantic classes of the textual collection by taking into account the a priori categorization of a subset of documents. These dissimilarities are then incorporated into the Torgerson MDS algorithm to improve the separation among topics in the map. The experimental results show that the proposed model outperforms well-known unsupervised alternatives.
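The pipeline described in the abstract can be sketched in a few lines of Python. The abstract does not give the paper's semi-supervised measures, so `adjust_with_labels` below is a hypothetical stand-in: it shrinks dissimilarities between documents known to share a category and stretches those between documents known to differ. The names `torgerson_mds`, `adjust_with_labels`, `shrink`, and `stretch` are assumptions made for illustration only; `torgerson_mds` is the standard classical (Torgerson) MDS based on double centering and an eigendecomposition.

```python
import numpy as np

def torgerson_mds(D, k=2):
    """Classical (Torgerson) MDS: embed n objects in k dimensions
    from an n x n symmetric dissimilarity matrix D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n       # centering matrix
    B = -0.5 * J @ (D ** 2) @ J               # double-centered inner-product matrix
    eigvals, eigvecs = np.linalg.eigh(B)      # eigenvalues in ascending order
    idx = np.argsort(eigvals)[::-1][:k]       # indices of the k largest eigenvalues
    lam = np.clip(eigvals[idx], 0.0, None)    # guard against small negative values
    return eigvecs[:, idx] * np.sqrt(lam)     # n x k coordinates

def adjust_with_labels(D, labels, shrink=0.5, stretch=1.5):
    """Hypothetical semi-supervised adjustment (NOT the paper's measure).
    labels[i] is a category id, or -1 if document i is unlabeled."""
    labels = np.asarray(labels)
    D2 = D.astype(float).copy()
    known = labels >= 0
    both = known[:, None] & known[None, :]    # pairs where both labels are known
    same = both & (labels[:, None] == labels[None, :])
    diff = both & (labels[:, None] != labels[None, :])
    D2[same] *= shrink                        # pull same-category pairs together
    D2[diff] *= stretch                       # push different-category pairs apart
    np.fill_diagonal(D2, 0.0)
    return D2

# Toy usage: six "documents" as points, only the first four labeled.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 2))
D = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
labels = [0, 0, 1, 1, -1, -1]
Y = torgerson_mds(adjust_with_labels(D, labels), k=2)  # 6 x 2 map coordinates
```

Pairs that involve an unlabeled document keep their original dissimilarity, so the labeled subset only reshapes the part of the map it has information about. In a real textual setting `D` would typically come from a term-based dissimilarity between documents rather than from random points.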