P-Biblio-MetReS, a parallel data mining tool for the reconstruction of molecular networks

Authors:
Ivan Teixidó;Anabel Usié;Josep Ll. Lérida;Francesc Solsona;Jorge Comas;Nestor Torres;Hiren Karathia;Rui Alves
Affiliations:
University of Lleida, Lleida, Spain;University of Lleida, Lleida, Spain;University of Lleida, Lleida, Spain;University of Lleida, Lleida, Spain;University of Lleida, Lleida, Spain;University of Lleida, Lleida, Spain;University of Lleida, Lleida, Spain;University of Lleida, Lleida, Spain
Venue:
Proceedings of the 20th European MPI Users' Group Meeting
Year:
2013

Abstract

Biblio-MetReS is a single-threaded data mining application that facilitates the reconstruction of molecular networks based on automated text mining analysis of published scientific literature. The application is very CPU-intensive and calls for High Performance Computing (HPC): because of the large number of tasks it executes, it can be quite slow. These tasks are repetitive and consist of mining information from large sets of scientific documents, a process whose time cost can be reduced through parallelization. This paper presents a parallel version of Biblio-MetReS. The multithreaded application P(arallel)-Biblio-MetReS distributes the work among copies of the same Java class, each mining a collection of documents obtained in a previous search phase from different literature sources on the Internet. In this article, we compare the performance of the parallel and non-parallel versions of the application and discuss scalability issues on multithreaded systems in the context of this application. We also optimize memory management and the reuse of document parsing results. Our experimental results corroborate the good performance of P-Biblio-MetReS and pinpoint specific aspects that still need to be improved.
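
The work distribution described in the abstract can be illustrated with a minimal Java sketch. It is not the P-Biblio-MetReS implementation: the class names `ParallelMiningSketch` and `MiningTask`, the method `countCoOccurrences`, the contiguous partitioning, and the toy "mining" rule (two gene names appearing in the same document) are all assumptions made for illustration. It only shows the general pattern of several instances of one Java class, each mining its own collection of documents on a fixed-size thread pool.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelMiningSketch {

    // Hypothetical worker: every instance mines its own slice of the documents.
    static class MiningTask implements Callable<Integer> {
        private final List<String> documents;
        private final String geneA;
        private final String geneB;

        MiningTask(List<String> documents, String geneA, String geneB) {
            this.documents = documents;
            this.geneA = geneA;
            this.geneB = geneB;
        }

        @Override
        public Integer call() {
            int hits = 0;
            for (String doc : documents) {
                // Toy stand-in for text mining: both names occur in the same document.
                if (doc.contains(geneA) && doc.contains(geneB)) {
                    hits++;
                }
            }
            return hits;
        }
    }

    // Splits the documents into contiguous chunks, mines each chunk on a
    // pool thread, and sums the partial counts. Assumes threads >= 1.
    public static int countCoOccurrences(List<String> documents, String geneA,
                                         String geneB, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            int chunk = (documents.size() + threads - 1) / threads; // ceiling division
            List<Future<Integer>> partials = new ArrayList<>();
            for (int start = 0; start < documents.size(); start += chunk) {
                int end = Math.min(start + chunk, documents.size());
                partials.add(pool.submit(
                        new MiningTask(documents.subList(start, end), geneA, geneB)));
            }
            int total = 0;
            for (Future<Integer> partial : partials) {
                total += partial.get(); // blocks until that worker has finished
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> docs = Arrays.asList(
                "HOG1 is activated by PBS2",
                "PBS2 phosphorylates HOG1",
                "SLN1 is a sensor kinase",
                "HOG1 moves to the nucleus");
        System.out.println(countCoOccurrences(docs, "HOG1", "PBS2", 2)); // prints 2
    }
}
```

Because each worker in this sketch reads only its own sublist and returns a private count, no locking is needed; results are combined only after the workers finish. A real tool that shares parsing results between workers would need a thread-safe cache, which this sketch does not attempt.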