Learning distance metric for regression by semidefinite programming with application to human age estimation

  • Authors:
  • Bo Xiao;Xiaokang Yang;Yi Xu;Hongyuan Zha

  • Affiliations:
  • Institute of Image Communication and Information Processing, Shanghai Jiao Tong University, Shanghai, China;Institute of Image Communication and Information Processing, Shanghai Jiao Tong University, Shanghai, China;Institute of Image Communication and Information Processing, Shanghai Jiao Tong University, Shanghai, China;Georgia Institute of Technology, Atlanta, GA, USA

  • Venue:
  • MM '09 Proceedings of the 17th ACM international conference on Multimedia
  • Year:
  • 2009

Quantified Score

Hi-index 0.00

Visualization

Abstract

A good distance metric for the input data is crucial in many pattern recognition and machine learning applications. Past studies have demonstrated that learning a metric from labeled samples can significantly improve the performance of classification and clustering algorithms. In this paper, we investigate the problem of learning a distance metric that measures the semantic similarity of input data for regression problems. The particular application we consider is human age estimation. Our guiding principle for learning the distance metric is to preserve the local neighborhoods based on a specially designed distance as well as to maximize the distances between data that are not in the same neighborhood in the semantic space.Without any assumption about the structure and the distribution of the input data, we show that this can be done by using semidefinite programming. Furthermore, the low-level feature space can be mapped to the high-level semantic space by a linear transformation with very low computational cost. Experimental results on the publicly available FG-NET database show that 1) the learned metric correctly discovers the semantic structure of the data even when the amount of training data is small and 2) significant improvement over the traditional Euclidean metric for regression can be obtained using the learned metric. Most importantly, simple regression methods such as k nearest neighbors (kNN), combined with our learned metric, become quite competitive (and sometimes even superior) in terms of accuracy when compared with the state-of-the-art human age estimation approaches.