Biomedical images and their captions are among the major sources of information in online biomedical publications. They often contain the most important results to be reported and provide rich information about the main themes of published papers. In the data mining and information retrieval communities, much effort has gone into using text mining and language modeling algorithms to extract knowledge from the text content of online biomedical publications; however, the problem of knowledge extraction from biomedical images and captions has not yet been fully studied. In this paper, a hierarchical probabilistic topic model with background distribution (HPB) is introduced to uncover latent semantic topics from the co-occurrence patterns of caption words, visual words, and biomedical concepts. From downloaded biomedical figures, restricted captions are extracted for each individual image panel. During the indexing stage, the 'bag-of-words' representation of captions is supplemented by ontology-based concept indexing to alleviate the synonymy and polysemy problems. As the visual counterpart of text words, visual words are extracted from the corresponding image panels and indexed. The model is estimated via a collapsed Gibbs sampling algorithm. We compare the performance of our model with an extension of the Correspondence LDA (Corr-LDA) model under the same biomedical image annotation scenario using cross-validation. Experimental results demonstrate that our model is able to accurately extract latent patterns from complicated biomedical image-caption pairs and to facilitate knowledge organization and understanding in online biomedical literature.
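To make the estimation step concrete, the following is a minimal sketch of collapsed Gibbs sampling for a plain LDA topic model. It is not the HPB model described above: it has no background distribution and no hierarchy, and it treats all tokens as ids from one shared vocabulary instead of distinguishing caption words, visual words, and biomedical concepts. The function name, the hyperparameters `alpha` and `beta`, and the toy corpus are assumptions made for illustration only.

```python
import random


def collapsed_gibbs_lda(docs, n_topics, vocab_size,
                        alpha=0.1, beta=0.01, n_iters=50, seed=0):
    """Collapsed Gibbs sampler for plain LDA (illustrative, not HPB).

    docs is a list of documents, each a list of integer token ids in
    the range [0, vocab_size). Returns the count tables and the final
    topic assignment of every token.
    """
    rng = random.Random(seed)
    doc_topic = [[0] * n_topics for _ in docs]             # tokens per (doc, topic)
    topic_word = [[0] * vocab_size for _ in range(n_topics)]  # tokens per (topic, word)
    topic_total = [0] * n_topics                           # tokens per topic
    assignments = []

    # Random initialisation of the topic assignment of every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            z = rng.randrange(n_topics)
            z_d.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(z_d)

    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current token from all counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1

                # Unnormalised conditional p(z = k | all other assignments).
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(n_topics)
                ]

                # Resample the topic and restore the counts.
                z = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    return doc_topic, topic_word, assignments


# Toy corpus: each "document" stands in for one image panel's token bag.
toy_docs = [[0, 1, 2, 1], [3, 4, 3, 5], [0, 2, 4, 5, 1]]
doc_topic, topic_word, assignments = collapsed_gibbs_lda(
    toy_docs, n_topics=2, vocab_size=6
)
```

The sampler integrates out the topic and word distributions and keeps only count tables, which is what "collapsed" refers to. A model with a background distribution would presumably add a further latent choice per token, which this sketch does not attempt.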