Automatic genre identification: towards a flexible classification scheme

  • Authors: Marina Santini
  • Affiliations: University of Brighton, Lewes Road, Brighton, UK
  • Venue: FDIA'07 Proceedings of the 1st BCS IRSG conference on Future Directions in Information Access
  • Year: 2007

Abstract

This paper presents an automatic genre classification model that implements a flexible classification scheme, i.e. a scheme capable of performing zero-, one- or multi-genre assignment. I suggest that this scheme is more appropriate for genres on the web, because many web pages often have more than one genre, or none at all. The model that I propose relies on the distinction between the concepts of 'text type' and 'genre', both of which are 'inferred' rather than 'learned' from pre-labelled examples. The main drawback of this approach is that it cannot be fully evaluated, given the limitations of current genre research. However, I present a partial evaluation showing that the model performs competitively and remains stable when re-scaled.
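
The idea of zero-, one- or multi-genre assignment can be illustrated with a minimal Python sketch. It is not the model described in the paper: the function name `assign_genres`, the per-genre scores, the genre labels and the threshold value are all assumptions made for illustration, and the abstract does not say how scores are inferred. The sketch only shows the shape of the output, in which a page may receive no genre, one genre or several.

```python
from typing import Dict, List


def assign_genres(scores: Dict[str, float], threshold: float = 0.5) -> List[str]:
    """Return every genre whose score reaches the threshold, best first.

    The scores and the threshold are hypothetical; an empty list means
    that no genre is assigned to the page.
    """
    selected = [genre for genre, score in scores.items() if score >= threshold]
    # Order the assigned genres by decreasing score.
    return sorted(selected, key=lambda genre: scores[genre], reverse=True)


# Hypothetical per-genre scores for three web pages.
pages = {
    "page_a": {"blog": 0.8, "faq": 0.6, "eshop": 0.1},  # multi-genre
    "page_b": {"blog": 0.2, "faq": 0.7, "eshop": 0.3},  # one genre
    "page_c": {"blog": 0.2, "faq": 0.1, "eshop": 0.3},  # no genre
}

for name, scores in pages.items():
    print(name, assign_genres(scores))
```

A single-label classifier would be forced to pick exactly one genre for each of these pages, whereas a thresholded assignment of this kind leaves the number of genres per page open.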