Proteins are complex biological polymers that mediate virtually all cellular functions. Typically, these functions are modulated by protein-protein interactions (PPIs). Life scientists have made tremendous efforts to detect PPIs through different experimental approaches and to document the results in publications. On the informatics front, however, an effective means of retrieving PPI information from the published literature is lacking. In this work, we present a novel framework for identifying the experimental methods used to analyze PPIs in biomedical articles. Unlike state-of-the-art approaches based only on text, we explore combining attributes from figures, figure captions, and text within figures to identify PPI experimental methods. Our work is motivated by the observation that biomedical figures often constitute direct evidence of experimental results and therefore provide information complementary to the text. We start by automatically extracting unimodal panels (subfigures) and their associated subcaptions, and then classify each subfigure into a type using a proposed hierarchical image taxonomy. Next, we combine the subfigure types with text-based features to form a hybrid feature descriptor and use it for PPI method classification. We further construct a dataset starting from a set of 2,256 documents provided by the molecular interaction database MINT. We show that our new approach outperforms the text-only solution for associating figures with PPI methods.
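The hybrid feature descriptor mentioned above can be pictured as a simple concatenation of an image-derived feature and a text-derived feature. The sketch below is only an illustration under stated assumptions, not the authors' implementation: the figure-type labels, the vocabulary, and the function names `one_hot`, `bag_of_words`, and `hybrid_descriptor` are all hypothetical, and the abstract does not specify the actual taxonomy or text features.

```python
from collections import Counter

# Hypothetical figure-type labels; the paper's actual taxonomy is not given here.
FIGURE_TYPES = ["gel", "microscopy", "graph", "other"]


def one_hot(figure_type, types=FIGURE_TYPES):
    """Encode a predicted subfigure type as a one-hot vector."""
    return [1.0 if figure_type == t else 0.0 for t in types]


def bag_of_words(text, vocabulary):
    """Count occurrences of each vocabulary term in lowercased, whitespace-split text."""
    counts = Counter(text.lower().split())
    return [float(counts[term]) for term in vocabulary]


def hybrid_descriptor(figure_type, subcaption, vocabulary, types=FIGURE_TYPES):
    """Concatenate image-type features and text features into one vector."""
    return one_hot(figure_type, types) + bag_of_words(subcaption, vocabulary)


# Hypothetical vocabulary of method-related terms (an assumption for illustration).
vocabulary = ["blot", "immunoprecipitation", "fluorescence", "yeast"]
descriptor = hybrid_descriptor(
    "gel",
    "Western blot after immunoprecipitation of the complex",
    vocabulary,
)
print(descriptor)
```

A vector of this kind could then be passed to any standard classifier to predict the PPI method associated with a subfigure; which classifier and which features the authors actually used is not stated in the abstract.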