Nearest-neighbor method using multiple neighborhood similarities for social media data mining

Authors:
Shuhui Wang;Qingming Huang;Shuqiang Jiang;Qi Tian;Lei Qin
Affiliations:
Key Laboratory of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing 100190, China;Key Laboratory of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing 100190, China and Graduate University, Chinese Academy of ...;Key Laboratory of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing 100190, China;Department of Computer Science, University of Texas at San Antonio, TX 78249, USA;Key Laboratory of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing 100190, China
Venue:
Neurocomputing
Year:
2012

Abstract

Nearest-neighbor (NN) approaches have been applied to large-scale, real-world image data mining. However, three disadvantages prevent their wider application compared with other machine learning methods: (i) their performance is inferior on small datasets; (ii) their performance degrades on high-dimensional data; and (iii) they depend heavily on the chosen feature and distance measure. In this paper, we try to overcome these three intrinsic weaknesses by taking the abundant and diversified content of social media images into account. First, we propose a novel neighborhood similarity measure that encodes both local density information and semantic information, and thus has better generalization power than the original image-to-image similarity. Second, to enhance scalability, we adopt kernelized Locality Sensitive Hashing (KLSH) to conduct approximate nearest-neighbor search using a set of kernels calculated on several complementary image features. Finally, to enhance robustness across diverse genres of images, we fuse the discriminative power of different features by combining multiple neighborhood similarities, calculated on different features/kernels, over all of the nearest labeled and unlabeled images retrieved by the hashing systems. Experimental results on visual categorization on Caltech-256 and two social media databases show the advantage of our method over traditional NN methods that use only the labeled data.
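
The abstract does not give the exact form of the neighborhood similarity, so the following Python sketch is only an illustration of the general idea, not the authors' formulation. It assumes that (a) each image is described by several feature vectors ("views"), (b) each view uses an RBF kernel, (c) the neighborhood similarity of two images is the mean kernel value between their k-nearest-neighbor sets drawn from labeled and unlabeled data, and (d) the per-view similarities are fused by a plain average. A brute-force neighbor search stands in for KLSH, and all function names are hypothetical.

```python
import math
from collections import defaultdict


def rbf(a, b, gamma=1.0):
    """RBF kernel between two feature vectors (assumed kernel choice)."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))


def neighborhood(point, pool, k, gamma=1.0):
    """Return the k pool items most similar to `point`.

    Brute-force search; a stand-in for the approximate KLSH lookup.
    """
    return sorted(pool, key=lambda p: rbf(point, p, gamma), reverse=True)[:k]


def neighborhood_similarity(a, b, pool, k, gamma=1.0):
    """Mean kernel value between the neighborhoods of a and b.

    This definition is an assumption made for illustration only.
    """
    na = neighborhood(a, pool, k, gamma)
    nb = neighborhood(b, pool, k, gamma)
    total = sum(rbf(p, q, gamma) for p in na for q in nb)
    return total / (len(na) * len(nb))


def classify(query_views, labeled, unlabeled, k=3):
    """Predict a label by fusing neighborhood similarities across views.

    query_views: list of feature vectors, one per view
    labeled:     list of (views, label) pairs
    unlabeled:   list of views for samples without labels
    """
    n_views = len(query_views)
    # One search pool per view, mixing labeled and unlabeled samples.
    pools = [
        [views[v] for views, _ in labeled] + [views[v] for views in unlabeled]
        for v in range(n_views)
    ]
    scores = defaultdict(float)
    for views, label in labeled:
        fused = sum(
            neighborhood_similarity(query_views[v], views[v], pools[v], k)
            for v in range(n_views)
        ) / n_views  # simple average as the fusion rule (assumption)
        scores[label] += fused
    return max(scores, key=scores.get)
```

In this sketch the unlabeled samples never vote directly; they only shape the neighborhoods, which is one way local density could enter the similarity. Replacing `neighborhood` with a hashing-based lookup would change the cost of the search but not the fusion logic.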