CGM: A biomedical text categorization approach using concept graph mining

  • Authors:
  • S. Bleik; Min Song; A. Smalter; Jun Huan; G. Lushington

  • Affiliations:
  • Dept. of Inf. Syst., New Jersey Inst. of Technol., Newark, NJ, USA; Dept. of Inf. Syst., New Jersey Inst. of Technol., Newark, NJ, USA; Dept. of Electr. Eng. & Comput. Sci., Univ. of Kansas, KS, USA; Dept. of Electr. Eng. & Comput. Sci., Univ. of Kansas, KS, USA; Dept. of Electr. Eng. & Comput. Sci., Univ. of Kansas, KS, USA

  • Venue:
  • BIBMW '09 Proceedings of the 2009 IEEE International Conference on Bioinformatics and Biomedicine Workshop
  • Year:
  • 2009

Abstract

Text categorization is used to organize and manage biomedical text databases, which are growing at an exponential rate. Feature representation of documents is a crucial factor in text categorization performance. Most successful existing techniques use a vector representation based on key entities extracted from the text. In this paper, we investigate a new direction in which each document is represented as a graph. In this representation, we identify high-level concepts and build a rich graph structure that contains additional concepts and relationships. We then use graph kernel techniques to perform text categorization. The results show a significant improvement in accuracy compared with categorization based only on the extracted concepts.
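
The abstract does not say how concepts are extracted, which relationships are added, or which graph kernel is used. The following is therefore only a minimal illustrative sketch of the general idea, not the authors' method. It assumes each document has already been reduced to a list of hypothetical (concept, relation, concept) triples, uses a simple kernel that counts shared node and edge labels, and substitutes a 1-nearest-neighbour rule for whatever classifier the paper employs. All function names, relation labels, and toy documents are invented for illustration.

```python
from collections import Counter


def graph_features(triples):
    """Turn a concept graph into a bag of node labels and labeled edges.

    `triples` is a list of (source_concept, relation, target_concept);
    this input format is an assumption, not taken from the paper.
    """
    feats = Counter()
    nodes = set()
    for src, rel, dst in triples:
        nodes.update((src, dst))
        feats[("edge", src, rel, dst)] += 1
    for node in nodes:
        feats[("node", node)] += 1
    return feats


def graph_kernel(g1, g2):
    """Dot product of the two feature bags (counts shared nodes and edges)."""
    f1, f2 = graph_features(g1), graph_features(g2)
    return sum(count * f2[key] for key, count in f1.items())


def normalized_kernel(g1, g2):
    """Cosine-style normalization so graph size does not dominate."""
    denom = (graph_kernel(g1, g1) * graph_kernel(g2, g2)) ** 0.5
    return graph_kernel(g1, g2) / denom if denom else 0.0


def classify(query, labeled_graphs):
    """Assign the label of the most similar training graph (1-NN stand-in)."""
    best_graph, best_label = max(
        labeled_graphs, key=lambda item: normalized_kernel(query, item[0])
    )
    return best_label


# Toy, hand-made concept graphs (hypothetical data).
train = [
    ([("aspirin", "treats", "headache"), ("aspirin", "is_a", "nsaid")],
     "pharmacology"),
    ([("brca1", "associated_with", "breast_cancer"), ("brca1", "is_a", "gene")],
     "genetics"),
]
query = [("ibuprofen", "is_a", "nsaid"), ("ibuprofen", "treats", "headache")]

predicted = classify(query, train)
```

In this toy case the query shares no extracted drug name with either training document, yet the added concepts `nsaid` and `headache` still link it to the first graph. This is the kind of signal a richer graph representation could provide over the extracted concepts alone.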