Invited paper: Structured literature image finder: Parsing text and figures in biomedical literature

Authors:
Amr Ahmed;Andrew Arnold;Luis Pedro Coelho;Joshua Kangas;Abdul-Saboor Sheikh;Eric Xing;William Cohen;Robert F. Murphy
Affiliations:
Machine Learning Department, Carnegie Mellon University, United States and Language Technologies Institute, Carnegie Mellon University, United States;Machine Learning Department, Carnegie Mellon University, United States;Joint Carnegie Mellon University-University of Pittsburgh Ph.D. Program in Computational Biology, United States and Center for Bioimage Informatics, Carnegie Mellon University, United States and L ...;Joint Carnegie Mellon University-University of Pittsburgh Ph.D. Program in Computational Biology, United States and Center for Bioimage Informatics, Carnegie Mellon University, United States and L ...;Center for Bioimage Informatics, Carnegie Mellon University, United States;Machine Learning Department, Carnegie Mellon University, United States and Language Technologies Institute, Carnegie Mellon University, United States and Joint Carnegie Mellon University-Universit ...;Machine Learning Department, Carnegie Mellon University, United States and Language Technologies Institute, Carnegie Mellon University, United States and Joint Carnegie Mellon University-Universit ...;Machine Learning Department, Carnegie Mellon University, United States and Joint Carnegie Mellon University-University of Pittsburgh Ph.D. Program in Computational Biology, United States and Cente ...
Venue:
Web Semantics: Science, Services and Agents on the World Wide Web
Year:
2010

Citing 9

Toward Optimal Active Learning through Sampling Estimation of Error Reduction
ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning

Searching Online Journals for Fluorescence Microscope Images Depicting Protein Subcellular Location Patterns
BIBE '01 Proceedings of the 2nd IEEE International Symposium on Bioinformatics and Bioengineering

Genre-Based Search through Biomedical Images
ICPR '02 Proceedings of the 16th International Conference on Pattern Recognition (ICPR'02) Volume 1

Understanding captions in biomedical publications
Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining

High-recall protein entity recognition using a dictionary
Bioinformatics

Integrating image data into biomedical text categorization
Bioinformatics

Structured correspondence topic models for mining captioned figures in biological literature
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining

Exploring text and image features to classify images in bioscience literature
BioNLP '06 Proceedings of the Workshop on Linking Natural Language Processing and Biology: Towards Deeper Biological Literature Analysis

Exploring the efficacy of caption search for bioscience journal search interfaces
BioNLP '07 Proceedings of the Workshop on BioNLP 2007: Biological, Translational, and Clinical Language Processing

Cited 2

Applicability assessment of Semantic Web technologies
Information Processing and Management: an International Journal

Structured literature image finder: extracting information from text and images in biomedical literature
ISMB/ECCB'09 Proceedings of the 2009 workshop of the BioLink Special Interest Group, international conference on Linking Literature, Information, and Knowledge for Biology

Abstract

The SLIF project combines text mining and image processing to extract structured information from biomedical literature. SLIF extracts images and their captions from published papers. The captions are automatically parsed for relevant biological entities (protein and cell type names), while the images are classified according to their type (e.g., micrograph or gel). Fluorescence microscopy images are further processed and classified according to the depicted subcellular localization. The results of this process can be queried online using either a user-friendly web interface or an XML-based web service. As an alternative to the targeted query paradigm, SLIF also supports browsing the collection based on latent topic models that are derived from both the annotated text and the image data. The SLIF web application, as well as labeled datasets used for training system components, is publicly available at http://slif.cbi.cmu.edu.
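
The caption-parsing and query steps described in the abstract can be illustrated with a minimal Python sketch. It is not SLIF's implementation: the names FigureRecord, tag_caption, and query, and the small PROTEINS and CELL_TYPES dictionaries, are hypothetical and chosen only for illustration. The image type is passed in as a label here, whereas the abstract describes it as the output of an image classifier.

```python
import re
from dataclasses import dataclass, field

# Hypothetical toy dictionaries; the lexicons actually used by SLIF
# are not specified in the abstract.
PROTEINS = {"actin", "tubulin", "gfp"}
CELL_TYPES = {"hela", "cos-7"}


@dataclass
class FigureRecord:
    """One figure with its caption and the structured annotations."""
    caption: str
    image_type: str  # e.g. "micrograph" or "gel"; assumed to be given
    proteins: list = field(default_factory=list)
    cell_types: list = field(default_factory=list)


def tag_caption(caption, image_type):
    """Find dictionary entries in a caption by exact token lookup."""
    tokens = re.findall(r"[A-Za-z0-9-]+", caption.lower())
    proteins = sorted({t for t in tokens if t in PROTEINS})
    cell_types = sorted({t for t in tokens if t in CELL_TYPES})
    return FigureRecord(caption, image_type, proteins, cell_types)


def query(records, protein):
    """Return the records whose caption mentions the given protein."""
    return [r for r in records if protein.lower() in r.proteins]


records = [
    tag_caption("Tubulin and actin in HeLa cells", "micrograph"),
    tag_caption("Western blot of actin in COS-7 lysate", "gel"),
]
hits = query(records, "tubulin")
```

Exact token lookup is the simplest possible stand-in for entity recognition. It misses hyphenated compounds and spelling variants, which is one reason a real system would need a more robust tagger.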