A partially supervised metric multidimensional scaling algorithm for textual data visualization

Authors:
Ángela Blanco; Manuel Martín-Merino
Affiliations:
Universidad Pontificia de Salamanca, Salamanca, Spain; Universidad Pontificia de Salamanca, Salamanca, Spain
Venue:
IDA'07 Proceedings of the 7th international conference on Intelligent data analysis
Year:
2007

Abstract

Multidimensional scaling (MDS) algorithms allow us to visualize relationships among high-dimensional objects in an intuitive way. An interesting application of MDS algorithms is the visualization of the semantic relations among documents or terms in textual databases. However, the MDS algorithms proposed in the literature exhibit low discriminant power: their unsupervised nature and the 'curse of dimensionality' cause different topics to overlap in the map. This problem can be overcome by exploiting the fact that many textual collections frequently provide a categorization for a small subset of documents. In this paper we define new semi-supervised dissimilarity measures that better reflect the semantic classes of the textual collection by taking into account the a priori categorization of a subset of documents. The dissimilarities are then incorporated into the Torgerson MDS algorithm to improve the separation among topics in the map. The experimental results show that the proposed model outperforms well-known unsupervised alternatives.
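The pipeline outlined in the abstract (dissimilarities adjusted with partial label information, then embedded with Torgerson MDS) can be illustrated with a minimal Python sketch. The abstract does not give the semi-supervised measures themselves, so `adjust_with_labels` and its `shrink` and `stretch` parameters are purely hypothetical stand-ins, not the authors' definitions. `torgerson_mds` follows the standard classical MDS recipe (double centering followed by the leading eigenvectors), with a simple power iteration used here only to keep the example dependency-free. The toy points and labels are invented for illustration.

```python
import math
import random


def adjust_with_labels(d, labels, shrink=0.5, stretch=1.5):
    """Hypothetical rescaling (NOT the paper's measure).

    labels[i] is a class identifier, or None for an uncategorized document.
    Same-class pairs are pulled together and different-class pairs pushed apart.
    """
    n = len(d)
    out = [row[:] for row in d]
    for i in range(n):
        for j in range(n):
            if i == j or labels[i] is None or labels[j] is None:
                continue  # pairs involving an unlabeled document stay unchanged
            out[i][j] *= shrink if labels[i] == labels[j] else stretch
    return out


def torgerson_mds(d, dim=2, iters=500, seed=0):
    """Classical (Torgerson) MDS for a symmetric dissimilarity matrix d."""
    n = len(d)
    # Double-center the squared dissimilarities: B = -1/2 * J * D^2 * J.
    sq = [[d[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(r) / n for r in sq]
    grand_mean = sum(row_mean) / n
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]

    rng = random.Random(seed)
    coords = [[0.0] * dim for _ in range(n)]
    for k in range(dim):
        # Power iteration for the dominant eigenpair of the (deflated) matrix.
        # Sketch only: a production version would use a symmetric eigensolver,
        # which also copes with large negative eigenvalues of non-Euclidean input.
        v = [rng.random() for _ in range(n)]
        start_norm = math.sqrt(sum(x * x for x in v))
        v = [x / start_norm for x in v]
        for _ in range(iters):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        bv = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = sum(v[i] * bv[i] for i in range(n))  # Rayleigh quotient
        if lam <= 1e-12:
            break  # no positive eigenvalue found; remaining axes stay at zero
        for i in range(n):
            coords[i][k] = math.sqrt(lam) * v[i]
        # Deflate so the next pass finds the next eigenpair.
        b = [[b[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]
    return coords


# Toy data (invented): four "documents" and a partial categorization.
points = [(0.0, 0.0), (3.0, 0.0), (0.0, 1.0), (3.0, 1.0)]
d = [[math.hypot(p[0] - q[0], p[1] - q[1]) for q in points] for p in points]
labels = ["a", "b", "a", None]  # the last document is uncategorized

coords_plain = torgerson_mds(d)                            # unsupervised map
coords_semi = torgerson_mds(adjust_with_labels(d, labels)) # partially supervised map
```

Only pairs in which both documents are categorized are rescaled, which mirrors the setting described in the abstract where labels exist for a small subset of the collection. Whether the paper's measures also propagate label information to unlabeled documents is not stated here, so the sketch leaves those pairs untouched.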