General purpose database summarization

  • Authors:
  • Régis Saint-Paul;Guillaume Raschia;Noureddine Mouaddib

  • Affiliations:
  • LINA - Polytech'Nantes, France;LINA - Polytech'Nantes, France;LINA - Polytech'Nantes, France

  • Venue:
  • VLDB '05: Proceedings of the 31st International Conference on Very Large Data Bases
  • Year:
  • 2005

Abstract

This paper presents a message-oriented architecture for summarizing large databases. The summarization system takes a database table as input and produces a reduced version of that table through both a rewriting process and a generalization process. The tuples of the resulting table are less precise than the originals, yet remain highly informative about the actual content of the database. This reduced form can serve as input to advanced data mining processes, as well as to specific applications presented in other works. We describe the incremental maintenance of the summarized table, the system's ability to work directly with XML database systems, and finally its scalability, which allows it to handle very large datasets of a million records.
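
To make the idea of rewriting and generalization concrete, here is a minimal Python sketch. It is not the authors' system: the label functions, thresholds, attribute names, and the use of a simple counter are all assumptions made for illustration. Each record is first rewritten with coarser labels, identical rewritten tuples are then merged into a single summary row with a count, and a new record can be folded in without rebuilding the summary.

```python
from collections import Counter

# Hypothetical background knowledge: thresholds and labels are assumptions.
def age_label(age):
    if age < 30:
        return "young"
    if age < 60:
        return "middle-aged"
    return "old"

def salary_label(salary):
    if salary < 30000:
        return "low"
    if salary < 70000:
        return "medium"
    return "high"

def rewrite(record):
    """Rewriting step: replace precise values with coarser labels."""
    return (age_label(record["age"]), salary_label(record["salary"]))

def summarize(records):
    """Generalization step: merge records that share the same rewritten
    form, keeping how many original tuples each summary row covers."""
    return Counter(rewrite(r) for r in records)

def add_record(summary, record):
    """Incremental maintenance: fold one new record into the summary."""
    summary[rewrite(record)] += 1

records = [
    {"age": 24, "salary": 21000},
    {"age": 27, "salary": 25000},
    {"age": 45, "salary": 62000},
    {"age": 52, "salary": 58000},
]

summary = summarize(records)
add_record(summary, {"age": 61, "salary": 90000})

for row, count in summary.items():
    print(row, count)
```

In this toy version, four input records collapse into two summary rows, and the fifth record adds a third row. The summary is less precise than the input but still describes how the data is distributed.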