On the limited memory BFGS method for large scale optimization
Mathematical Programming: Series A and B
Automating the Construction of Internet Portals with Machine Learning
Information Retrieval
Cluster ensembles --- a knowledge reuse framework for combining multiple partitions
The Journal of Machine Learning Research
Solving cluster ensemble problems by bipartite graph partitioning
ICML '04 Proceedings of the twenty-first international conference on Machine learning
Probabilistic latent semantic visualization: topic model for visualizing documents
Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
Combining supervised and unsupervised models via unconstrained probabilistic embedding
Information Sciences: an International Journal
Ensemble learning with outputs from multiple supervised and unsupervised models aims to improve the classification accuracy of the supervised model ensemble by jointly considering the grouping results from the unsupervised models. In this paper we cast this ensemble task as an unconstrained probabilistic embedding problem. Specifically, we assume that both objects and classes/clusters have latent coordinates, without constraints, in a D-dimensional Euclidean space, and we model the mapping from the embedded space to the space of results from the supervised and unsupervised models as a probabilistic generative process. The prediction for an object is then determined by the distances between the object and the classes in the embedded space. A solution to this embedding can be obtained using a quasi-Newton method, which places objects and classes/clusters with high co-occurrence weights close together in the embedding. We demonstrate the benefits of this unconstrained embedding method on three real-world applications.
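The abstract's pipeline can be sketched in a few lines: give every object and every class unconstrained coordinates, define class membership probabilities that decay with squared Euclidean distance, and minimize a co-occurrence-weighted negative log-likelihood with a quasi-Newton solver (L-BFGS, as in the first reference above). The toy data, weight matrix, and softmax-over-distances likelihood below are illustrative assumptions, not the paper's exact model.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import logsumexp

rng = np.random.default_rng(0)

# Hypothetical toy setup: 6 objects, 2 classes/clusters, and co-occurrence
# weights w[i, c] (e.g. how often base models associate object i with class c).
n_obj, n_cls, D = 6, 2, 2
w = np.array([[3, 0], [3, 0], [2, 1],
              [0, 3], [1, 2], [0, 3]], dtype=float)

def neg_log_likelihood(theta):
    # Unpack the unconstrained coordinates of objects and classes.
    x = theta[:n_obj * D].reshape(n_obj, D)
    y = theta[n_obj * D:].reshape(n_cls, D)
    # Squared Euclidean distance between every object and every class.
    d2 = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)
    # Softmax over classes: p(c | i) proportional to exp(-||x_i - y_c||^2).
    logp = -d2 - logsumexp(-d2, axis=1, keepdims=True)
    # Co-occurrence-weighted negative log-likelihood.
    return -(w * logp).sum()

# Quasi-Newton (L-BFGS) optimization from a random start.
theta0 = rng.normal(size=(n_obj + n_cls) * D)
res = minimize(neg_log_likelihood, theta0, method="L-BFGS-B")
x = res.x[:n_obj * D].reshape(n_obj, D)
y = res.x[n_obj * D:].reshape(n_cls, D)

# Prediction: each object is assigned to the nearest class in the embedding.
pred = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1).argmin(1)
print(pred)
```

Because the embedding is unconstrained, class labels are only identifiable up to permutation: objects sharing high weights on the same class end up near that class's coordinate, which is the "embedded close" behavior the abstract describes.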