Biomedical text categorization with concept graph representations using a controlled vocabulary

  • Authors: Meenakshi Mishra; Jun Huan; Said Bleik; Min Song
  • Affiliations: University of Kansas; University of Kansas; New Jersey Institute of Technology; Yonsei University
  • Venue: Proceedings of the 11th International Workshop on Data Mining in Bioinformatics
  • Year: 2012

Abstract

Recent work on graph representations for text categorization has shown promising performance compared with the conventional bag-of-words representation of text documents. In this paper we investigate a graph representation of texts for the task of text categorization. In our representation, we identify high-level concepts drawn from a database of controlled biomedical terms and build a rich graph structure that contains the important concepts and the relationships among them. Because every graph is described with a consistent vocabulary, the graphs are easier to compare. We then classify document graphs by applying a set-based graph kernel that is intuitively sensible and can handle the disconnectedness of the constructed concept graphs. We compare this approach with standard approaches that use non-graph, text-based features. We also compare several kernels to determine which performs best.
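
The general idea of a concept graph compared with a set-based kernel can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the vocabulary, the edge definition, or the kernel formula, so everything below is an assumption made for illustration. The toy `VOCAB` mapping, the rule that links concepts co-occurring in a sentence, and the normalized intersection kernel in `set_kernel` are all hypothetical stand-ins.

```python
import re
from itertools import combinations
from math import sqrt

# Hypothetical toy vocabulary (assumption): surface term -> canonical concept.
# A real system would query a controlled biomedical terminology instead.
VOCAB = {
    "heart attack": "Myocardial Infarction",
    "myocardial infarction": "Myocardial Infarction",
    "aspirin": "Aspirin",
    "stroke": "Stroke",
}


def concept_graph(text):
    """Return (nodes, edges) for one document.

    Assumption: two concepts are linked when they co-occur in a sentence.
    Matching is naive substring search, which is enough for a toy example.
    """
    nodes, edges = set(), set()
    for sentence in re.split(r"[.!?]+", text.lower()):
        found = {concept for term, concept in VOCAB.items() if term in sentence}
        nodes |= found
        # Undirected edges are stored as frozensets so order does not matter.
        edges |= {frozenset(pair) for pair in combinations(sorted(found), 2)}
    return nodes, edges


def set_kernel(g1, g2):
    """Normalized intersection kernel over node and edge sets.

    Graphs are treated as plain sets, so isolated nodes and disconnected
    components need no special handling. Returns a value in [0, 1].
    """
    (n1, e1), (n2, e2) = g1, g2
    shared = len(n1 & n2) + len(e1 & e2)
    size1, size2 = len(n1) + len(e1), len(n2) + len(e2)
    if size1 == 0 or size2 == 0:
        return 0.0  # a document with no recognized concepts matches nothing
    return shared / sqrt(size1 * size2)


doc_a = "Aspirin lowers the risk of heart attack. Stroke is also studied."
doc_b = "Myocardial infarction patients received aspirin."
graph_a, graph_b = concept_graph(doc_a), concept_graph(doc_b)
similarity = set_kernel(graph_a, graph_b)
```

Because both documents are mapped to the same canonical concept names, "heart attack" and "myocardial infarction" count as a shared node even though the surface text differs. A matrix of such pairwise values could then be passed to any classifier that accepts a precomputed kernel.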