Efficient online mining of large databases

  • Authors:
  • Fadila Bentayeb; Jerome Darmont; Cecile Favre; Cedric Udrea

  • Affiliations:
  • ERIC, University of Lyon 2, 5 avenue Pierre Mendes-France, 69676 Bron Cedex, France; ERIC, University of Lyon 2, 5 avenue Pierre Mendes-France, 69676 Bron Cedex, France; ERIC, University of Lyon 2, 5 avenue Pierre Mendes-France, 69676 Bron Cedex, France; EURISE, University of St Etienne, 23 rue du Docteur Paul Michelon, 42023 Saint Etienne Cedex 2, France

  • Venue:
  • International Journal of Business Information Systems
  • Year:
  • 2007

Abstract

Considerable effort has gone into applying data mining algorithms to large databases. However, long processing times remain a practical issue. This paper presents a framework that offers database users online operators for mining large databases, with no limit on database size and within acceptable processing times. First, we integrate decision tree algorithms directly into database management systems, so that we are limited only by disc capacity rather than by main memory. However, disc accesses still induce long response times. Hence, in a second step, we propose two optimisations: reducing the size of the learning database by building its corresponding contingency table, and reducing the number of database accesses by exploiting bitmap indices. The various decision tree based methods we implemented within Oracle therefore operate on contingency tables or bitmap indices rather than on the whole training set. Our experiments show the efficiency of these integrated methods.
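
The two optimisations named in the abstract can be illustrated with a minimal sketch. It is not the authors' Oracle implementation: it assumes an in-memory SQLite database, a made-up toy training set, and hypothetical table, column and function names (`training`, `contingency`, `counts_from_contingency`, `counts_from_bitmaps`). It shows that the class counts a decision tree needs to score a split can be read either from an aggregated contingency table or from bitmaps combined with a bitwise AND, without rescanning the full training set.

```python
import math
import sqlite3
from collections import defaultdict

# Hypothetical toy training set: (outlook, windy, play); "play" is the class.
ROWS = [
    ("sunny", "no", "no"), ("sunny", "yes", "no"),
    ("overcast", "no", "yes"), ("rain", "no", "yes"),
    ("rain", "yes", "no"), ("overcast", "yes", "yes"),
    ("sunny", "no", "yes"), ("rain", "no", "yes"),
]
COLUMNS = {"outlook": 0, "windy": 1, "play": 2}


def build_db(rows):
    """Load the training set and aggregate it into a contingency table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE training (outlook TEXT, windy TEXT, play TEXT)")
    conn.executemany("INSERT INTO training VALUES (?, ?, ?)", rows)
    # One row per distinct attribute/class combination, with its frequency n.
    conn.execute(
        "CREATE TABLE contingency AS "
        "SELECT outlook, windy, play, COUNT(*) AS n "
        "FROM training GROUP BY outlook, windy, play"
    )
    return conn


def counts_from_contingency(conn, attribute):
    """Class counts per attribute value, read from the contingency table."""
    if attribute not in COLUMNS:  # the name is interpolated into SQL below
        raise ValueError(f"unknown column: {attribute}")
    groups = defaultdict(dict)
    query = (
        f"SELECT {attribute}, play, SUM(n) FROM contingency "
        f"GROUP BY {attribute}, play"
    )
    for value, label, n in conn.execute(query):
        groups[value][label] = n
    return dict(groups)


def build_bitmaps(rows, col):
    """One integer bitmap per distinct value; bit i is set if row i matches."""
    bitmaps = defaultdict(int)
    for i, row in enumerate(rows):
        bitmaps[row[col]] |= 1 << i
    return dict(bitmaps)


def counts_from_bitmaps(rows, attribute):
    """Same class counts, obtained by AND-ing bitmaps and counting set bits."""
    attr_maps = build_bitmaps(rows, COLUMNS[attribute])
    class_maps = build_bitmaps(rows, COLUMNS["play"])
    groups = {}
    for value, a in attr_maps.items():
        by_class = {}
        for label, c in class_maps.items():
            n = bin(a & c).count("1")
            if n:
                by_class[label] = n
        groups[value] = by_class
    return groups


def entropy(counts):
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c)


def info_gain(groups):
    """Information gain of a split, given {value: {class label: count}}."""
    class_totals = defaultdict(int)
    for by_class in groups.values():
        for label, n in by_class.items():
            class_totals[label] += n
    total = sum(class_totals.values())
    remainder = sum(
        sum(by_class.values()) / total * entropy(list(by_class.values()))
        for by_class in groups.values()
    )
    return entropy(list(class_totals.values())) - remainder


if __name__ == "__main__":
    conn = build_db(ROWS)
    for attribute in ("outlook", "windy"):
        groups = counts_from_contingency(conn, attribute)
        print(attribute, round(info_gain(groups), 3))
```

In this sketch the contingency table is never larger than the training set and shrinks as duplicate attribute combinations become more frequent, while the bitmap route replaces row scans with bitwise operations. Either source feeds the same split criterion, here information gain as used by ID3-style trees.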