Data preparation for data mining in medical data sets

  • Authors:
  • Grzegorz Ilczuk; Alicja Wakulicz-Deja

  • Affiliations:
  • Siemens AG Medical Solutions, Erlangen, Germany; Institute of Informatics, University of Silesia, Sosnowiec, Poland

  • Venue:
  • Transactions on Rough Sets VI
  • Year:
  • 2007

Abstract

Data preparation is an important but time-consuming part of the data mining process. In this paper we describe a hierarchical method of text classification based on regular expressions. We use this method in the pre-processing stage of our data mining system to transform free-text medical reports written in Latin into a decision table. These decision tables serve as input to a rough-set-based rule induction subsystem. We also compare the accuracy and scalability of our method with those of a standard approach based on dictionary phrases.
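
The idea of hierarchical, regular-expression-based classification can be illustrated with a minimal Python sketch. It is not the implementation from the paper: the Node structure, the attribute names, the Latin phrases, and the "present"/"absent" coding are all assumptions chosen for illustration. Each node holds one pattern, child nodes are tried only when the parent matches, and the most specific matching label becomes the attribute value in one row of the decision table.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Node:
    """One level of a hypothetical classification hierarchy."""
    label: str
    pattern: re.Pattern
    children: list = field(default_factory=list)


def classify(text, node, path=()):
    """Return the deepest label paths whose patterns all match the text."""
    if not node.pattern.search(text):
        return []
    path = path + (node.label,)
    # Children are tested only after the parent pattern has matched.
    deeper = [p for child in node.children for p in classify(text, child, path)]
    return deeper or [path]


def to_row(text, roots):
    """Build one decision-table row: one attribute per top-level node."""
    row = {}
    for root in roots:
        paths = classify(text, root)
        if not paths:
            row[root.label] = "absent"
        else:
            # Use the refinement below the root, or "present" if there is none.
            row[root.label] = "/".join(paths[0][1:]) or "present"
    return row


# Illustrative hierarchy; the patterns are assumptions, not taken from the paper.
ROOTS = [
    Node("atrial_fibrillation", re.compile(r"fibrillatio\s+atriorum", re.I), [
        Node("paroxysmal", re.compile(r"paroxysmalis", re.I)),
        Node("chronic", re.compile(r"chronica", re.I)),
    ]),
    Node("hypertension", re.compile(r"hypertonia\s+arterialis", re.I)),
]

reports = [
    "Fibrillatio atriorum paroxysmalis. Hypertonia arterialis.",
    "Status post infarctum myocardii.",
]

table = [to_row(report, ROOTS) for report in reports]
for row in table:
    print(row)
```

In this sketch a report that mentions a condition without any refinement is coded as "present", which keeps every attribute defined for every row, as a decision table requires.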