Medical image retrieval using bag of meaningful visual words: unsupervised visual vocabulary pruning with PLSA

Authors:
Antonio Foncubierta-Rodríguez;Alba García Seco de Herrera;Henning Müller
Affiliations:
University of Applied Sciences Western Switzerland, Sierre, Switzerland;University of Applied Sciences Western Switzerland, Sierre, Switzerland;University of Applied Sciences Western Switzerland, Sierre, Switzerland
Venue:
Proceedings of the 1st ACM international workshop on Multimedia indexing and information retrieval for healthcare
Year:
2013

Abstract

Content-based medical image retrieval has been proposed as a technique that allows not only easy access to images from the relevant literature and electronic health records but also supports physician training, research and clinical decision support. The bag-of-visual-words approach is a widely used technique that tries to narrow the semantic gap by learning meaningful features from the dataset and describing documents and images in terms of the histogram of these features. Visual vocabularies are often redundant, over-complete and noisy. Larger-than-required vocabularies lead to high-dimensional feature spaces, which have important disadvantages, the curse of dimensionality and computational cost being the most obvious ones. In this work a visual vocabulary pruning technique is presented. It greatly reduces the number of words required to describe a medical image dataset with no significant effect on accuracy. Results show that a reduction of up to 90% can be achieved without impact on system performance. Obtaining a more compact representation of a document enables multimodal description as well as the use of classifiers that require low-dimensional representations.
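The abstract does not state the exact pruning criterion, so the following is only a minimal sketch of how PLSA-based vocabulary pruning might look, not the authors' implementation. It fits PLSA by EM on a document-by-word count matrix of visual-word histograms and then keeps the words with the highest topic-conditional probability. The function names `plsa` and `prune_vocabulary`, the `keep_fraction` parameter, the max-over-topics score and the synthetic counts are all assumptions made for illustration.

```python
import numpy as np


def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit PLSA by EM on a (documents x words) count matrix.

    Returns P(z|d) with shape (D, Z) and P(w|z) with shape (Z, W).
    """
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    eps = 1e-12  # guards against division by zero

    # Random, row-normalised initialisation of both distributions.
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_topics, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # E-step: P(z|d,w) is proportional to P(z|d) * P(w|z); shape (D, Z, W).
        joint = p_z_d[:, :, None] * p_w_z[None, :, :]
        post = joint / (joint.sum(axis=1, keepdims=True) + eps)

        # M-step: re-estimate from expected counts n(d,w) * P(z|d,w).
        expected = counts[:, None, :] * post
        p_w_z = expected.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + eps
        p_z_d = expected.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + eps

    return p_z_d, p_w_z


def prune_vocabulary(counts, p_w_z, keep_fraction=0.1):
    """Keep the words with the largest max_z P(w|z) (assumed criterion)."""
    scores = p_w_z.max(axis=0)
    n_keep = max(1, int(round(keep_fraction * counts.shape[1])))
    keep = np.sort(np.argsort(scores)[::-1][:n_keep])
    return counts[:, keep], keep


# Synthetic visual-word histograms: 20 images, vocabulary of 50 words.
rng = np.random.default_rng(1)
counts = rng.poisson(2.0, size=(20, 50)).astype(float)

p_z_d, p_w_z = plsa(counts, n_topics=4)
pruned, keep = prune_vocabulary(counts, p_w_z, keep_fraction=0.1)
print(pruned.shape)  # (20, 5): a 90% reduction of the vocabulary
```

With `keep_fraction=0.1` the sketch mirrors the 90% reduction mentioned in the abstract, but whether retrieval accuracy is preserved depends on the data and on the actual criterion used in the paper.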