Characterizing Genres of Web Pages: Genre Hybridism and Individualization

  • Author: Marina Santini
  • Affiliation: University of Brighton, UK
  • Venue: HICSS '07 Proceedings of the 40th Annual Hawaii International Conference on System Sciences
  • Year: 2007

Abstract

When dealing with genres of web pages, two important aspects must be taken into account. On the one hand, the web is fluid, unstable and fast-paced. On the other hand, genres on the web are instantiated in web pages, which are a complex type of document, more composite and unpredictable than paper documents. These two aspects are interwoven and often result in classification hurdles. In this paper, I suggest analyzing these classification problems in terms of two broad textual phenomena: genre hybridism and individualization. Identifying these two phenomena helps pinpoint the range of flexibility that an automatic classification system should have. More precisely, genre hybridism accounts for multi-genre variation within an individual web page, while individualization refers to the absence of any recognized genre in a web page. In short, the aim of this paper is to show that web pages need a zero-to-multi-genre classification scheme, i.e., a scheme that allows zero-genre or multi-genre classification in addition to the traditional single-genre classification.
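
To make the zero-to-multi-genre idea concrete, here is a minimal sketch in Python. It is not the method used in the paper: it assumes that some upstream model has already produced an independent score per genre, and it simply keeps every genre whose score reaches a cutoff. The function names, the genre labels, and the 0.5 threshold are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def assign_genres(scores: Dict[str, float], threshold: float = 0.5) -> List[str]:
    """Return every genre whose score reaches the threshold, highest score first.

    Assumption: scores are independent per genre (they need not sum to 1),
    so zero, one, or several genres can pass the cutoff.
    """
    selected = [genre for genre, score in scores.items() if score >= threshold]
    return sorted(selected, key=lambda genre: scores[genre], reverse=True)


def describe(labels: List[str]) -> str:
    """Map a label set onto the three outcomes the scheme allows."""
    if not labels:
        return "individualized (zero genres)"
    if len(labels) == 1:
        return "single genre"
    return "hybrid (multi-genre)"


if __name__ == "__main__":
    # Hypothetical scores for three pages; the genre names are illustrative only.
    pages = {
        "page_a": {"blog": 0.8, "faq": 0.6, "eshop": 0.1},
        "page_b": {"blog": 0.9, "faq": 0.3, "eshop": 0.2},
        "page_c": {"blog": 0.2, "faq": 0.1, "eshop": 0.3},
    }
    for name, scores in pages.items():
        labels = assign_genres(scores)
        print(name, labels, describe(labels))
```

The design point is that the classifier returns a list rather than a single label, so an empty list (individualization) and a list with several entries (genre hybridism) are ordinary outcomes rather than errors.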