Invited paper: Structured literature image finder: Parsing text and figures in biomedical literature
Web Semantics: Science, Services and Agents on the World Wide Web
Slif uses a combination of text mining and image processing to extract information from figures in the biomedical literature. It also uses innovative extensions to traditional latent topic modeling to provide new ways to traverse the literature. Slif provides a publicly available, searchable database (http://slif.cbi.cmu.edu). Slif originally focused on fluorescence microscopy images; we have now extended it to classify panels into additional image types. We also improved the classification into subcellular classes by building a more representative training set. To make the most of the human labeling effort, we used active learning to select which images to label. We developed models that take into account the structure of the document (panels inside figures inside papers) and the multi-modality of the information (free and annotated text, images, and information from external databases). These models have allowed us to provide new ways to navigate a large collection of documents.
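The abstract does not state which active-learning criterion was used to pick images for labeling. As an illustration only, the following minimal sketch uses entropy-based uncertainty sampling, a common generic criterion and not necessarily the one used in Slif. The panel ids, the probability values, and the function names `entropy` and `select_for_labeling` are all hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_labeling(pool_probs, k):
    """Return the ids of the k unlabeled items with the most uncertain predictions.

    pool_probs maps a (hypothetical) panel id to the class probabilities
    predicted for that panel by the current classifier.
    """
    ranked = sorted(
        pool_probs,
        key=lambda panel_id: entropy(pool_probs[panel_id]),
        reverse=True,  # highest entropy (most uncertain) first
    )
    return ranked[:k]


# Made-up predictions for four unlabeled panels over three classes.
pool = {
    "panel_a": [0.98, 0.01, 0.01],  # confident prediction
    "panel_b": [0.40, 0.35, 0.25],  # fairly uncertain
    "panel_c": [0.70, 0.20, 0.10],  # moderately confident
    "panel_d": [0.34, 0.33, 0.33],  # close to uniform
}

chosen = select_for_labeling(pool, k=2)
print(chosen)
```

In a full loop, the selected panels would be labeled by a person, added to the training set, and the classifier retrained before the next round of selection.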