Multiple instance learning based on positive instance selection and bag structure construction

  • Authors:
  • Zhan Li;Guo-Hua Geng;Jun Feng;Jin-Ye Peng;Chao Wen;Jun-Li Liang

  • Affiliations:
  • School of Information Science and Technology, Northwest University, Xi'an 710069, China (Zhan Li, Guo-Hua Geng, Jun Feng, Jin-Ye Peng, Chao Wen); School of Computer Science and Engineering and School of Automation and Information Engineering, Xi'an University of Technology, Xi'an 710048, China (Jun-Li Liang)

  • Venue:
  • Pattern Recognition Letters
  • Year:
  • 2014

Abstract

Previous studies on multiple instance learning (MIL) have shown that MIL problems exhibit three characteristics: positive instance clustering, bag structure, and the probabilistic influence of instances on the bag label. In this paper, we exploit all three characteristics to propose two simple yet effective MIL algorithms, CK_MIL and ck_MIL. We convert MIL to a standard supervised learning problem in three steps. First, we run the K-means clustering algorithm on the positive and negative sets separately to obtain cluster centers, and then use these centers to select the most positive instances in the bags. Next, we combine three distances (the maximum, minimum, and average distances from a bag to the cluster centers) to describe the bag structure. Finally, for CK_MIL, we simply concatenate the positive instance and the bag structure into a new vector as the bag representation and apply an RBF kernel to measure bag similarity; for ck_MIL, we construct a new kernel that introduces a probabilistic coefficient to balance the influence of positive instance similarity and bag structure similarity. As a result, the MIL problem becomes a standard supervised learning problem that can be solved directly by an SVM. Experiments on MUSK and the COREL image set show that our two algorithms outperform other key existing MIL algorithms on drug prediction and image classification tasks.
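
The abstract does not give the exact formulas, so the following is only a minimal Python sketch of how the bag representation and the ck_MIL-style kernel might be assembled once cluster centers are available. Several details are assumptions made for illustration and are not taken from the paper: one instance is selected per bag, by preferring instances close to a positive center and far from negative centers; the bag-to-center distance is the distance of the bag's closest instance; and the kernel is a convex combination weighted by a coefficient `p`. All function names and parameters are hypothetical.

```python
import math


def euclid(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_positive_instance(bag, pos_centers, neg_centers):
    """Pick one instance per bag (assumed rule, not from the paper):
    maximize (distance to nearest negative center) minus
    (distance to nearest positive center)."""
    def score(inst):
        d_neg = min(euclid(inst, c) for c in neg_centers)
        d_pos = min(euclid(inst, c) for c in pos_centers)
        return d_neg - d_pos
    return max(bag, key=score)


def bag_structure(bag, centers):
    """Return [max, min, mean] of the bag-to-center distances.
    Assumption: the distance from a bag to a center is the distance
    of the bag's closest instance to that center."""
    d = [min(euclid(inst, c) for inst in bag) for c in centers]
    return [max(d), min(d), sum(d) / len(d)]


def rbf(u, v, gamma):
    """Standard RBF kernel exp(-gamma * ||u - v||^2)."""
    return math.exp(-gamma * euclid(u, v) ** 2)


def ck_kernel(bag_a, bag_b, pos_centers, neg_centers, p=0.5, gamma=1.0):
    """Assumed form of a combined kernel: p weights positive-instance
    similarity, (1 - p) weights bag-structure similarity."""
    centers = pos_centers + neg_centers
    inst_a = select_positive_instance(bag_a, pos_centers, neg_centers)
    inst_b = select_positive_instance(bag_b, pos_centers, neg_centers)
    struct_a = bag_structure(bag_a, centers)
    struct_b = bag_structure(bag_b, centers)
    return p * rbf(inst_a, inst_b, gamma) + (1 - p) * rbf(struct_a, struct_b, gamma)


# Toy data: the centers would normally come from K-means on each set.
pos_centers = [[3.0, 4.0]]
neg_centers = [[0.0, 0.0]]
bag_1 = [[0.0, 0.0], [3.0, 4.0]]
bag_2 = [[0.5, 0.0], [2.5, 3.5]]
k_12 = ck_kernel(bag_1, bag_2, pos_centers, neg_centers, p=0.5, gamma=0.1)
```

Under these assumptions, a Gram matrix built from `ck_kernel` could be passed to an SVM that accepts precomputed kernels. A CK_MIL-style variant would instead concatenate the selected instance with the three structure values and apply a single RBF kernel to the resulting vector.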