Learning image-to-class distance metric for image classification

  • Authors:
  • Zhengxiang Wang;Yiqun Hu;Liang-Tien Chia

  • Affiliations:
  • Nanyang Technological University, Singapore;The University of Western Australia, Crawley, WA, Australia;Nanyang Technological University, Singapore

  • Venue:
  • ACM Transactions on Intelligent Systems and Technology (TIST) - Special section on agent communication, trust in multiagent systems, intelligent tutoring and coaching systems
  • Year:
  • 2013

Quantified Score

Hi-index 0.00

Visualization

Abstract

Image-To-Class (I2C) distance is a novel distance used for image classification and has successfully handled datasets with large intra-class variances. However, it uses Euclidean distance for measuring the distance between local features in different classes, which may not be the optimal distance metric in real image classification problems. In this article, we propose a distance metric learning method to improve the performance of I2C distance by learning per-class Mahalanobis metrics in a large margin framework. Our I2C distance is adaptive to different classes by combining with the learned metric for each class. These multiple per-class metrics are learned simultaneously by forming a convex optimization problem with the constraints that the I2C distance from each training image to its belonging class should be less than the distances to other classes by a large margin. A subgradient descent method is applied to efficiently solve this optimization problem. For efficiency and scalability to large-scale problems, we also show how to simplify the method to learn a diagonal matrix for each class. We show in experiments that our learned Mahalanobis I2C distance can significantly outperform the original Euclidean I2C distance as well as other distance metric learning methods in several prevalent image datasets, and our simplified diagonal matrices can preserve the performance but significantly speed up the metric learning procedure for large-scale datasets. We also show in experiment that our method is able to correct the class imbalance problem, which usually leads the NN-based methods toward classes containing more training images.