Gathering and mining information from web log files

  • Authors:
  • Maristella Agosti; Giorgio Maria Di Nunzio

  • Affiliations:
  • Department of Information Engineering, University of Padua, Italy;Department of Information Engineering, University of Padua, Italy

  • Venue:
  • DELOS'07 Proceedings of the 1st international conference on Digital libraries: research and development
  • Year:
  • 2007

Abstract

This paper proposes a general methodology for gathering and mining information from Web log files. A series of tools to retrieve, store, and analyze the data extracted from log files has been designed and implemented. The aim is to derive general methods by abstracting from the analysis of logs that use a well-defined standard format, such as the Extended Log File Format proposed by the W3C. The methodology was tested on the Web log files of The European Library portal; the experimental analyses led to personal, technical, geographical, and temporal findings about usage and traffic load. The paper also presents considerations on tracking users and user profiles more accurately, and on managing crawler accesses better through authentication.
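
As an illustration of the kind of processing the abstract describes, the following is a minimal sketch of reading entries in the W3C Extended Log File Format and counting requests per hour and per client. It is not the authors' tooling: the function name `parse_extended_log`, the sample log lines, and the chosen fields are assumptions made for this example, and the parser assumes that field values contain no embedded whitespace (quoted strings with spaces are not handled).

```python
from collections import Counter


def parse_extended_log(lines):
    """Yield one dict per log entry, keyed by the most recent #Fields directive.

    Hypothetical helper: assumes whitespace-separated values with no
    embedded spaces inside a field.
    """
    fields = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Directive line, e.g. "#Fields: date time c-ip ..."
            name, _, value = line[1:].partition(":")
            if name.strip().lower() == "fields":
                fields = value.split()
            continue
        values = line.split()
        if len(values) != len(fields):
            continue  # skip entries that do not match the declared fields
        yield dict(zip(fields, values))


# Made-up sample data for illustration only (addresses from a documentation range).
SAMPLE_LOG = """\
#Version: 1.0
#Date: 2007-01-01 00:00:00
#Fields: date time c-ip cs-method cs-uri sc-status
2007-01-01 00:00:01 192.0.2.1 GET /index.html 200
2007-01-01 01:15:42 192.0.2.2 GET /search 200
2007-01-01 01:20:03 192.0.2.1 GET /missing 404
"""

entries = list(parse_extended_log(SAMPLE_LOG.splitlines()))

# Temporal view: requests grouped by hour of day (first two characters of "time").
requests_per_hour = Counter(entry["time"][:2] for entry in entries)

# Per-client view: requests grouped by client IP address.
requests_per_client = Counter(entry["c-ip"] for entry in entries)
```

Reading the field names from the `#Fields` directive, rather than hard-coding column positions, keeps the sketch independent of any one server's configuration, which is in the spirit of the general methods the abstract aims for.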