Comparative study of classification techniques on biomedical data from hypertext documents

  • Authors:
  • Rashedur M. Rahman;Sazia Salahuddin

  • Affiliations:
  • Department of Electrical Engineering and Computer Science, North South University, Plot-15, Block-B, Bashundhara, Dhaka 1229, Bangladesh;Department of Electrical Engineering and Computer Science, North South University, Plot-15, Block-B, Bashundhara, Dhaka 1229, Bangladesh

  • Venue:
  • International Journal of Knowledge Engineering and Soft Data Paradigms
  • Year:
  • 2013

Abstract

In this paper, our goal is to mine biomedical data from hypertext documents, that is, from web content, using data mining algorithms with the help of a biomedical ontology. We collect a number of documents using Google, preprocess the hypertext documents, and extract their text. The next step is to identify the biomedical data: to decide whether a word is a biomedical entity, we use a biomedical database, the UMLS Metathesaurus. Biomedical entities are mapped from the Metathesaurus by keyword query. The more often biomedical entities occur in a page, the more relevant the page is taken to be, so we can re-rank the documents to find the most important ones. We then test and analyse the performance of seven of the most popular classification algorithms by training them separately on the documents as ranked by Google and as ranked by our algorithm.
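
The re-ranking idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the preprocessing or the matching rules, so the code below assumes a simple tag-stripping step, single-word matching, and a tiny hard-coded term set (`BIOMEDICAL_TERMS`) as a hypothetical stand-in for a UMLS Metathesaurus lookup. The function names are likewise illustrative.

```python
import re
from html.parser import HTMLParser

# Hypothetical stand-in for a UMLS Metathesaurus lookup (assumption:
# the real system queries the Metathesaurus by keyword instead).
BIOMEDICAL_TERMS = {"insulin", "diabetes", "glucose", "hypertension"}


class TextExtractor(HTMLParser):
    """Collects text content, skipping <script> and <style> bodies."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.parts.append(data)


def extract_text(html):
    """Preprocessing step: reduce a hypertext document to plain text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def count_biomedical_entities(text, terms=BIOMEDICAL_TERMS):
    """Count word occurrences that match a known biomedical term."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return sum(1 for token in tokens if token in terms)


def rerank(documents):
    """Re-rank (doc_id, html) pairs by descending entity count.

    Python's sort is stable, so documents with equal counts keep
    their original (search-engine) order.
    """
    scored = [
        (doc_id, count_biomedical_entities(extract_text(html)))
        for doc_id, html in documents
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Calling `rerank` on a list of `(doc_id, html)` pairs in their original search-engine order returns `(doc_id, count)` pairs, with the most entity-dense pages first. A real system would likely also need multi-word term matching and some normalisation by page length, neither of which is attempted here.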