Combining supervised and unsupervised models via unconstrained probabilistic embedding

  • Authors:
  • Xiang Ao; Ping Luo; Xudong Ma; Fuzhen Zhuang; Qing He; Zhongzhi Shi; Zhiyong Shen

  • Venue:
  • Information Sciences: an International Journal
  • Year:
  • 2014

Abstract

In this study, we consider an ensemble problem in which we combine the outputs of models developed in supervised and unsupervised modes. By jointly considering the grouping results produced by the unsupervised models, we aim to improve the classification accuracy of the supervised model ensemble. Here, we formulate the ensemble task as an Unconstrained Probabilistic Embedding (UPE) problem. Specifically, we assume that both objects and classes/clusters have unconstrained latent coordinates in a D-dimensional Euclidean space, and treat the mapping from the embedded space to the space of model outputs as a probabilistic generative process. A solution to this embedding can be obtained with a quasi-Newton method, which places objects and classes/clusters with high co-occurrence weights close to each other in the embedded space. Prediction is then made by comparing the distances between an object and the classes in the embedded space. We demonstrate the benefits of this unconstrained embedding method through extensive and systematic experiments on real-world datasets. Furthermore, we conduct experiments to investigate how the quality and the number of clustering models affect the performance of this ensemble method. We also show the robustness of the proposed model.
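
The sketch below is only an illustration of the embedding idea described in the abstract, not the authors' implementation: object-to-class/cluster co-occurrence weights are collected from the ensemble outputs, latent coordinates are fitted with a quasi-Newton optimizer (L-BFGS-B here, as an assumption), and each object is assigned to its nearest class embedding. The toy data, the softmax-style likelihood, and all variable names are assumptions made for the example.

```python
"""Illustrative UPE-style sketch (assumed details, not the paper's code):
embed objects and classes/clusters so that pairs with high co-occurrence
weights end up close, then predict by nearest class embedding."""
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Toy ensemble outputs: 6 objects, 2 classes (0/1).
# Two supervised models vote a class per object; one clustering model groups them.
supervised = np.array([[0, 0, 0, 1, 1, 1],
                       [0, 0, 1, 1, 1, 1]])   # rows: models, cols: objects
clusters = np.array([0, 0, 0, 1, 1, 1])       # one clustering partition

n_obj, n_cls, n_clu, D = 6, 2, 2, 2
n_items = n_cls + n_clu                        # classes first, then clusters

# Co-occurrence weights W[i, j]: how often object i is linked to class/cluster j.
W = np.zeros((n_obj, n_items))
for preds in supervised:
    for i, c in enumerate(preds):
        W[i, c] += 1.0
for i, k in enumerate(clusters):
    W[i, n_cls + k] += 1.0

def neg_log_likelihood(theta):
    """Assumed generative model: p(item j | object i) ∝ exp(-||x_i - phi_j||^2)."""
    X = theta[:n_obj * D].reshape(n_obj, D)        # object coordinates
    Phi = theta[n_obj * D:].reshape(n_items, D)    # class/cluster coordinates
    d2 = ((X[:, None, :] - Phi[None, :, :]) ** 2).sum(-1)
    logp = -d2 - np.log(np.exp(-d2).sum(axis=1, keepdims=True))
    return -(W * logp).sum()

# Quasi-Newton optimization of all latent coordinates (gradients approximated).
theta0 = rng.normal(scale=0.1, size=(n_obj + n_items) * D)
res = minimize(neg_log_likelihood, theta0, method="L-BFGS-B")

X = res.x[:n_obj * D].reshape(n_obj, D)
Phi = res.x[n_obj * D:].reshape(n_items, D)
# Predict each object's class as the nearest class embedding (clusters only shape the space).
dist_to_classes = ((X[:, None, :] - Phi[None, :n_cls, :]) ** 2).sum(-1)
print("ensemble predictions:", dist_to_classes.argmin(axis=1))
```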