Synthetic pattern generation for imbalanced learning in image retrieval

Authors:
Luca Piras;Giorgio Giacinto
Affiliations:
Department of Electrical and Electronic Engineering, University of Cagliari, 09123 Piazza D'armi, Cagliari, Italy;Department of Electrical and Electronic Engineering, University of Cagliari, 09123 Piazza D'armi, Cagliari, Italy
Venue:
Pattern Recognition Letters
Year:
2012

Abstract

Very large archives of digital images are now easy to produce thanks to the wide availability of digital cameras, which are often embedded in portable devices. One way to explore an image archive is to search for similar images. Because the images most similar to a query according to a set of visual features may not contain the semantic concepts the user is looking for, relevance feedback mechanisms can be employed to refine the search. Relevance feedback lets users label the images returned by the system as relevant or non-relevant, and this labelled set is then used to learn the characteristics of relevant images. Since the number of images shown to users for feedback is usually quite small, and relevant images typically represent only a tiny fraction, the learning problem is heavily imbalanced. To reduce this imbalance, this paper proposes techniques that artificially increase the number of examples of the relevant class. The new examples are generated as points in the feature space that agree with the local distribution of the available relevant examples. The locality of the proposed approach makes it well suited to relevance feedback techniques based on the Nearest-Neighbor (NN) paradigm. The effectiveness of the approach is assessed on two image datasets, and comparisons with editing techniques that eliminate redundancies among the non-relevant examples are also reported.
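
The abstract does not state the exact generation rule, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a SMOTE-style interpolation: each synthetic relevant point is placed on the segment between a relevant example and one of its k nearest relevant neighbours, which keeps new points inside the local region already occupied by relevant examples. The function names, the parameter k, and the distance-ratio relevance score are all hypothetical choices made for illustration.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def generate_synthetic_relevant(relevant, n_new, k=3, seed=0):
    """Create n_new synthetic relevant points by local interpolation.

    ASSUMPTION: SMOTE-style interpolation between a relevant example and
    one of its k nearest relevant neighbours. The paper may use a
    different generation rule.
    """
    if len(relevant) < 2:
        return []  # no neighbour available to interpolate with
    rng = random.Random(seed)
    k = min(k, len(relevant) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(relevant))
        base = relevant[i]
        # k nearest relevant neighbours of base, excluding base itself
        others = [p for j, p in enumerate(relevant) if j != i]
        neighbours = sorted(others, key=lambda p: euclidean(base, p))[:k]
        mate = rng.choice(neighbours)
        t = rng.random()  # position along the segment base -> mate
        synthetic.append([b + t * (m - b) for b, m in zip(base, mate)])
    return synthetic


def nn_relevance_score(x, relevant, non_relevant):
    """Illustrative NN score in [0, 1] (an assumption, not from the abstract).

    The score is higher when x is closer to its nearest relevant example
    than to its nearest non-relevant example.
    """
    d_rel = min(euclidean(x, p) for p in relevant)
    d_non = min(euclidean(x, p) for p in non_relevant)
    if d_rel + d_non == 0:
        return 0.5  # x coincides with examples of both classes
    return d_non / (d_rel + d_non)


# Toy feedback round: few relevant images, many non-relevant ones.
relevant = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
non_relevant = [[5.0 + i, 5.0] for i in range(10)]
augmented = relevant + generate_synthetic_relevant(relevant, n_new=7)
score = nn_relevance_score([0.2, 0.2], augmented, non_relevant)
```

In this toy round the relevant class grows from 3 to 10 examples, matching the size of the non-relevant class, and every synthetic point stays on a segment joining two labelled relevant examples. How many points to generate and how large a neighbourhood to use are design choices the abstract leaves open.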