Semi-supervised learning with local and global consistency by geodesic distance and sparse representation

  • Authors:
  • Jie Gui; Zhongqiu Zhao; Rongxiang Hu; Wei Jia

  • Affiliations:
  • Intelligent Computing Lab, Hefei Institute of Intelligent Machines, Chinese Academy of Sciences, Hefei, Anhui, China, State Key Laboratory for Novel Software Technology, Nanjing University, P.R. Ch ...; College of Computer Science and Information Engineering, Hefei University of Technology, China; Institute of Energy Safety, Chinese Academy of Sciences, Hefei, Anhui, China; Institute of Energy Safety, Chinese Academy of Sciences, Hefei, Anhui, China

  • Venue:
  • IScIDE'12 Proceedings of the third Sino-foreign-interchange conference on Intelligent Science and Intelligent Data Engineering
  • Year:
  • 2012

Abstract

In many practical data mining applications, such as web categorization and key gene selection, unlabeled training examples are readily available while labeled ones are fairly expensive to obtain. Semi-supervised learning algorithms, and graph-based methods in particular, have therefore attracted much attention in recent years. However, most of these traditional methods adopt a Gaussian function of the Euclidean distance to calculate the edge weights of the graph. In this paper, a novel weight for semi-supervised graph-based methods is proposed. The new method incorporates the label information of the problem into the target function, and adopts the geodesic distance rather than the Euclidean distance as the measure of the difference between two data points. In addition, we incorporate class prior knowledge of the problem into the semi-supervised learning algorithm. Here we address the problem of learning with local and global consistency (LGC). We found that the effect of class prior knowledge probably differs between high and low label rates. Furthermore, we integrate sparse representation (SR) into the LGC algorithm. Experiments on a UCI data set show that our proposed method outperforms the original algorithms.
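The abstract does not include code, so the following is only a rough sketch of the general idea it describes: replace the Euclidean distance in the Gaussian edge weight with a geodesic (graph shortest-path) distance, then run the standard LGC label propagation F ← αSF + (1−α)Y. All function names, the toy data, and parameter values (k, σ, α) here are our own assumptions for illustration, not the authors' implementation, which also adds label information and class priors not modeled below.

```python
import math
import heapq

def euclid(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_graph(X, k):
    """Symmetrized k-nearest-neighbor graph; adj[i] maps neighbor -> edge length."""
    n = len(X)
    adj = [dict() for _ in range(n)]
    for i in range(n):
        nearest = sorted((euclid(X[i], X[j]), j) for j in range(n) if j != i)[:k]
        for d, j in nearest:
            adj[i][j] = d
            adj[j][i] = d
    return adj

def geodesic_from(adj, src):
    """Dijkstra shortest-path (geodesic) distances from src; inf if unreachable."""
    dist = [math.inf] * len(adj)
    dist[src] = 0.0
    pq = [(0.0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:
            continue
        for v, w in adj[u].items():
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return dist

def lgc_geodesic(X, labels, n_classes, k=2, sigma=1.0, alpha=0.99, iters=200):
    """LGC label propagation with Gaussian weights on geodesic distances.

    labels[i] is the class index for labeled points, -1 for unlabeled.
    """
    n = len(X)
    adj = knn_graph(X, k)
    G = [geodesic_from(adj, i) for i in range(n)]
    # Gaussian weight on the geodesic rather than the Euclidean distance;
    # exp(-inf) = 0.0, so disconnected pairs get zero weight.
    W = [[math.exp(-G[i][j] ** 2 / (2 * sigma ** 2)) if i != j else 0.0
          for j in range(n)] for i in range(n)]
    D = [sum(row) for row in W]
    # Symmetrically normalized affinity S = D^{-1/2} W D^{-1/2}
    S = [[W[i][j] / math.sqrt(D[i] * D[j]) for j in range(n)] for i in range(n)]
    Y = [[1.0 if labels[i] == c else 0.0 for c in range(n_classes)]
         for i in range(n)]
    F = [row[:] for row in Y]
    for _ in range(iters):  # iterate F <- alpha*S*F + (1-alpha)*Y
        F = [[alpha * sum(S[i][j] * F[j][c] for j in range(n))
              + (1 - alpha) * Y[i][c]
              for c in range(n_classes)] for i in range(n)]
    return [max(range(n_classes), key=lambda c: row[c]) for row in F]
```

For example, with two well-separated clusters and one labeled point per cluster, `lgc_geodesic([(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)], [0, -1, -1, -1, 1, -1, -1, -1], 2)` propagates each label through its own connected component of the kNN graph.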