Networked mining of atomic and molecular data from electronic journal databases on the internet

  • Authors:
  • Lukáš Pichl;Manabu Suzuki;Kazuyuki Joe;Akira Sasaki

  • Affiliations:
  • Department of Computer Software, University of Aizu, Aizuwakamatsu, Japan;Department of Computer Software, University of Aizu, Aizuwakamatsu, Japan;Department of Information and Computer Sciences, Nara Women's University, Nara, Japan;Kansai Research Establishment, Japan Atomic Energy Research Institute, Kyoto, Japan

  • Venue:
  • DNIS'05 Proceedings of the 4th international conference on Databases in Networked Information Systems
  • Year:
  • 2005

Abstract

Several atomic and molecular data centers around the world maintain research databases for use in fusion plasma simulations, hadron therapy, modelling of the universe and other areas. Among the data center activities, the collection of experimental and theoretical results from across the world has been of major importance. This includes the identification, relevance assessment and retrieval of journal articles, followed by data extraction, data mining, format conversion and data input. The process still relies largely on working groups of specialists and part-time human labor, in spite of the recent modernization of journal publishing, especially the electronic journals newly available in the subscription domain and the free-access online abstract databases. This work focuses on automating the above procedure to the maximum extent possible. In particular, we design a download robot that, in the first stage, performs query search and abstract retrieval for candidate relevant articles over the internet, followed by full-text retrieval (PDF format), text extraction and a deterministic relevance judgement. As a demonstration, we have also developed a bibliography database for electron-molecule collisions that automatically updates its contents over the internet at regular time intervals. The present work is part of a project on an evolutional data collecting system, supported by JSPS, which involves several research institutes.
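
The two-stage procedure outlined in the abstract (candidate selection from abstracts, then a deterministic judgement on the extracted full text) might be sketched roughly as follows. This is only an illustration and not the authors' implementation: the term weights, the threshold, and the function names (`relevance_score`, `is_relevant`, `judge_article`) are all assumptions made for this example, and the full-text download and PDF text extraction are replaced by a caller-supplied function.

```python
import re

# Hypothetical term weights for electron-molecule collisions.
# These values are assumptions for illustration; they do not come from the paper.
TERM_WEIGHTS = {
    "electron": 1,
    "molecule": 1,
    "collision": 2,
    "cross section": 3,
    "scattering": 2,
}
THRESHOLD = 5  # assumed cutoff for the full-text judgement


def relevance_score(text):
    """Sum the weights of the terms present in the text (each counted once)."""
    lowered = text.lower()
    score = 0
    for term, weight in TERM_WEIGHTS.items():
        # Prefix match at a word boundary, so "molecule" also matches "molecules".
        if re.search(r"\b" + re.escape(term), lowered):
            score += weight
    return score


def is_relevant(text):
    """Deterministic rule: the same text always yields the same verdict."""
    return relevance_score(text) >= THRESHOLD


def judge_article(abstract, fetch_fulltext):
    """Stage 1: keep the article as a candidate if its abstract hits any term.
    Stage 2: fetch the full text only for candidates and judge it."""
    if relevance_score(abstract) == 0:
        return False
    return is_relevant(fetch_fulltext())
```

In this sketch `fetch_fulltext` stands in for whatever performs the download and text extraction, so the comparatively expensive second stage runs only for articles that pass the cheap abstract check.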