Data summarization model for user action log files

  • Authors:
  • Eleonora Gentili;Alfredo Milani;Valentina Poggioni

  • Affiliations:
  • Dipartimento di Matematica e Informatica, Università degli Studi di Perugia, Perugia, Italy;Dipartimento di Matematica e Informatica, Università degli Studi di Perugia, Perugia, Italy;Dipartimento di Matematica e Informatica, Università degli Studi di Perugia, Perugia, Italy

  • Venue:
  • ICCSA'12 Proceedings of the 12th international conference on Computational Science and Its Applications - Volume Part III
  • Year:
  • 2012

Abstract

In recent years we have seen impressive growth and diffusion of applications shared and used by huge numbers of users around the world, such as social networks, web portals, and e-learning platforms. Such systems generally produce large amounts of data, normally stored in raw format in log file systems and databases. To prevent unmanageable growth of the storage space required and a breakdown in data usability, such data can be condensed and summarized, which improves reporting performance and reduces system load. Summarization reduces the space required to store the data but, as a side effect, decreases its informative capability because information is lost. This work studies the problem of summarizing data produced by the logging systems of applications with many users. In particular, it proposes a model that represents the raw data as temporal events collected in time sequences, provides methods to reduce the data size by collapsing the descriptions of several events into a single descriptor or a smaller set of descriptors, and poses the optimal summarization problem.
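
The abstract does not specify the paper's actual model or summarization operators, so the following is only a minimal illustrative sketch of the general idea: log entries are treated as timestamped events, and events falling in the same fixed-length time window are collapsed into a single descriptor. The names `Event`, `Summary`, and `summarize`, the fixed-window grouping, and the choice to keep only per-action counts are all assumptions made for illustration, not the authors' method.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One raw log entry (hypothetical schema)."""
    timestamp: int  # seconds since an arbitrary origin (assumed unit)
    user: str
    action: str


@dataclass(frozen=True)
class Summary:
    """One descriptor standing in for all events in [start, end)."""
    start: int
    end: int
    action_counts: tuple  # sorted (action, count) pairs
    n_events: int


def summarize(events, window):
    """Collapse events into one descriptor per fixed-length time window."""
    buckets = {}
    for e in events:
        # Align each event to the start of the window that contains it.
        start = (e.timestamp // window) * window
        buckets.setdefault(start, Counter())[e.action] += 1
    return [
        Summary(start, start + window, tuple(sorted(c.items())), sum(c.values()))
        for start, c in sorted(buckets.items())
    ]


log = [
    Event(0, "u1", "login"),
    Event(12, "u2", "login"),
    Event(45, "u1", "view"),
    Event(70, "u2", "post"),
]
summaries = summarize(log, window=60)
for s in summaries:
    print(s)
```

Here four raw events become two descriptors. The user identities and exact timestamps are no longer recoverable from the summaries, which illustrates the trade-off the abstract describes: less storage at the cost of information loss. A larger `window` compresses more and loses more.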