MeSHy: Mining unanticipated PubMed information using frequencies of occurrences and concurrences of MeSH terms

  • Authors: T. Theodosiou; I. S. Vizirianakis; L. Angelis; A. Tsaftaris; N. Darzentas

  • Affiliations:
    • Institute of Agrobiotechnology, Centre for Research and Technology - Hellas (CERTH), P.O. Box 361, 6km Charilaou-Thermis, GR-57001, Thessaloniki, Greece and Department of Informatics, School of Na ...
    • Laboratory of Pharmacology, Department of Pharmaceutical Sciences, Aristotle University of Thessaloniki, GR-54124, Thessaloniki, Greece
    • Department of Informatics, School of Natural Sciences, Aristotle University of Thessaloniki, GR-54124, Thessaloniki, Greece
    • Institute of Agrobiotechnology, Centre for Research and Technology - Hellas (CERTH), P.O. Box 361, 6km Charilaou-Thermis, GR-57001, Thessaloniki, Greece and Department of Genetics and Plant Breedi ...
    • Institute of Agrobiotechnology, Centre for Research and Technology - Hellas (CERTH), P.O. Box 361, 6km Charilaou-Thermis, GR-57001, Thessaloniki, Greece

  • Venue: Journal of Biomedical Informatics
  • Year: 2011

Abstract

Motivation: PubMed is the most widely used database of biomedical literature. To the detriment of the user though, the ranking of the documents retrieved for a query is not content-based, and important semantic information in the form of assigned Medical Subject Headings (MeSH) terms is not readily presented or productively utilized. The motivation behind this work was the discovery of unanticipated information through the appropriate ranking of MeSH term pairs and, indirectly, documents. Such information can be useful in guiding novel research and following promising trends.

Methods: A web-based tool, called MeSHy, was developed implementing a mainly statistical algorithm. The algorithm takes into account the frequencies of occurrences, concurrences, and the semantic similarities of MeSH terms in retrieved PubMed documents to create MeSH term pairs. These are then scored and ranked, focusing on their unexpectedly frequent or infrequent occurrences.

Results: MeSHy presents results through an online interactive interface facilitating further manipulation through filtering and sorting. The results themselves include the MeSH term pairs, along with MeSH categories, the score, and document IDs, all of which are hyperlinked for convenience. To highlight the applicability of the tool, we report the findings of an expert in the pharmacology field on querying the molecularly-targeted drug imatinib and nutrition-related flavonoids. To the best of our knowledge, MeSHy is the first publicly available tool able to directly provide such a different perspective on the complex nature of published work.

Implementation and availability: Implemented in Perl and served by Apache2 at http://bat.ina.certh.gr/tools/meshy/ with all major browsers supported.
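
Illustrative sketch: the Methods summary describes building MeSH term pairs from retrieved documents and ranking them by how unexpectedly often they occur together. The abstract does not state MeSHy's actual scoring formula, so the Python sketch below substitutes a simple stand-in: the log ratio of the observed co-occurrence count to the count expected if the two terms were independent. The semantic-similarity component is omitted entirely. The function name `score_mesh_pairs`, the input layout (document ID mapped to a set of MeSH terms), and the toy data are all assumptions made for illustration, and only pairs that co-occur at least once are scored.

```python
from collections import Counter
from itertools import combinations
from math import log2


def score_mesh_pairs(docs):
    """Rank MeSH term pairs by how far their co-occurrence departs from chance.

    docs: dict mapping a document ID to an iterable of MeSH terms
          (hypothetical input layout, not MeSHy's own).
    Returns a list of (pair, score, document IDs), most surprising first.
    """
    n_docs = len(docs)
    term_counts = Counter()   # number of documents containing each term
    pair_counts = Counter()   # number of documents containing both terms
    pair_docs = {}            # documents supporting each pair

    for doc_id, terms in docs.items():
        unique_terms = sorted(set(terms))
        term_counts.update(unique_terms)
        for pair in combinations(unique_terms, 2):
            pair_counts[pair] += 1
            pair_docs.setdefault(pair, []).append(doc_id)

    results = []
    for pair, observed in pair_counts.items():
        a, b = pair
        # Co-occurrence count expected under independence of the two terms.
        expected = term_counts[a] * term_counts[b] / n_docs
        # Stand-in score: positive = more frequent than expected,
        # negative = less frequent than expected.
        score = log2(observed / expected)
        results.append((pair, score, pair_docs[pair]))

    # Rank by magnitude so both unexpectedly frequent and
    # unexpectedly infrequent pairs rise to the top.
    results.sort(key=lambda item: abs(item[1]), reverse=True)
    return results


if __name__ == "__main__":
    # Invented toy data; real input would come from a PubMed query.
    toy_docs = {
        "doc1": {"Flavonoids", "Neoplasms"},
        "doc2": {"Flavonoids", "Neoplasms"},
        "doc3": {"Flavonoids", "Diet"},
        "doc4": {"Diet"},
    }
    for pair, score, doc_ids in score_mesh_pairs(toy_docs):
        print(pair, round(score, 3), doc_ids)
```

Sorting by absolute score mirrors the stated focus on both unexpectedly frequent and unexpectedly infrequent pairs. Keeping the supporting document IDs with each pair is what allows a ranking of pairs to lead, indirectly, to a ranking of documents.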