Top_Keyword: An Aggregation Function for Textual Document OLAP

  • Authors:
  • Franck Ravat;Olivier Teste;Ronan Tournier;Gilles Zurfluh

  • Affiliations:
  • IRIT (UMR5505), Université de Toulouse, Toulouse Cedex 9, France F-31062;IRIT (UMR5505), Université de Toulouse, Toulouse Cedex 9, France F-31062;IRIT (UMR5505), Université de Toulouse, Toulouse Cedex 9, France F-31062;IRIT (UMR5505), Université de Toulouse, Toulouse Cedex 9, France F-31062

  • Venue:
  • DaWaK '08 Proceedings of the 10th international conference on Data Warehousing and Knowledge Discovery
  • Year:
  • 2008

Abstract

For more than a decade, research on OLAP and multidimensional databases has generated methodologies, tools and resource management systems for the analysis of numeric data. With the growing availability of digital documents, there is a need to incorporate text-rich documents into multidimensional databases, together with a framework adapted to their analysis. This paper presents a new aggregation function that aggregates textual data in an OLAP environment. The Top_Keyword function (Top_Kw for short) represents a set of documents by their most significant terms, using a weighting function from information retrieval: tf.idf.
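As a rough illustration of the idea, the sketch below ranks the terms of each group of documents by a tf.idf weight and keeps the k best. It is not the paper's definition of Top_Kw: the names `tokenize`, `top_keywords` and `cells` are hypothetical, each group (standing in for a cell of an OLAP result) is treated as a single bag of words, idf is computed across groups as log(N / df), and the sample texts are invented for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and keep alphabetic runs; a real system would also remove stop words.
    return re.findall(r"[a-z]+", text.lower())


def top_keywords(cells, k=2):
    """Return the k highest-weighted terms for each group of documents.

    `cells` maps a group label to the list of document texts aggregated into it
    (an assumption of this sketch, not a structure taken from the paper).
    """
    # Term counts per group: all documents of one group form one bag of words.
    counts = {
        label: Counter(term for doc in docs for term in tokenize(doc))
        for label, docs in cells.items()
    }
    n_cells = len(counts)
    # Number of groups in which each term occurs at least once.
    df = Counter(term for c in counts.values() for term in c)

    result = {}
    for label, c in counts.items():
        total = sum(c.values())
        # tf = relative frequency within the group, idf = log(N / df).
        scores = {
            term: (n / total) * math.log(n_cells / df[term])
            for term, n in c.items()
        }
        # Highest score first; ties broken alphabetically for a stable order.
        ranked = sorted(scores, key=lambda term: (-scores[term], term))
        result[label] = ranked[:k]
    return result


# Invented sample data: two groups of short documents.
cells = {
    "2007": ["olap cube olap query", "cube query olap"],
    "2008": ["xml document xml query", "document query xml"],
}
print(top_keywords(cells, k=2))
```

In this toy input the term "query" occurs in both groups, so its idf is zero and it never ranks among the top keywords, which is the behaviour tf.idf is meant to provide: terms common to every group say little about any one of them.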