Multidimensional text classification for drug information

  • Authors:
  • V. Lertnattee;T. Theeramunkong

  • Affiliations:
  • Sirindhorn Int. Inst. of Technol., Thammasat Univ., Pathumthani, Thailand;-

  • Venue:
  • IEEE Transactions on Information Technology in Biomedicine
  • Year:
  • 2004

Quantified Score

Hi-index 0.00

Visualization

Abstract

This paper proposes a multidimensional model for classifying drug information text documents. The concept of multidimensional category model is introduced for representing classes. In contrast with traditional flat and hierarchical category models, the multidimensional category model classifies each document using multiple predefined sets of categories, where each set corresponds to a dimension. Since a multidimensional model can be converted to flat and hierarchical models, three classification approaches are possible, i.e., classifying directly based on the multidimensional model and classifying with the equivalent flat or hierarchical models. The efficiency of these three approaches is investigated using drug information collection with two different dimensions: 1) drug topics and 2) primary therapeutic classes. In the experiments, k-nearest neighbor, naïve Bayes, and two centroid-based methods are selected as classifiers. The comparisons among three approaches of classification are done using two-way analysis of variance, followed by the Scheffe´'s test for post hoc comparison. The experimental results show that multidimensional-based classification performs better than the others, especially in the presence of a relatively small training set. As one application, a category-based search engine using the multidimensional category concept was developed to help users retrieve drug information.