A system for document markup and organisation

  • Authors:
  • Shazia Akhtar; John Dunnion; Ronan G. Reilly

  • Affiliations:
  • University College Dublin, Dublin, Ireland; University College Dublin, Dublin, Ireland; National University of Ireland, Maynooth, Maynooth, County Kildare, Ireland

  • Venue:
  • ISICT '04: Proceedings of the 2004 International Symposium on Information and Communication Technologies
  • Year:
  • 2004

Abstract

In this paper, we present a system that uses a Self-Organising Map (SOM) to mark up text documents in XML. The system organises pre-tagged XML documents on the SOM so that documents with similar content are placed close to each other. Then, using the inductive learning algorithm C5.0, it learns markup rules from the nearest SOM neighbours of a new, unmarked document. Experiments on a number of document corpora suggest that the approach is promising.
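
The organisation and neighbour-retrieval steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each document has already been reduced to a fixed-length numeric vector (for example, normalised term frequencies), and the function names, grid size and training schedule are hypothetical choices made for illustration. The C5.0 rule-learning step is not shown; the sketch only returns the pre-tagged documents from which such rules would be learned.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def best_matching_unit(weights, vector):
    """Return the grid position whose weight vector is closest to `vector`."""
    return min(weights, key=lambda node: euclidean(weights[node], vector))


def train_som(vectors, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a small rectangular SOM (illustrative schedule, not from the paper).

    Returns a dict mapping (row, col) grid positions to weight vectors.
    """
    rng = random.Random(seed)
    dim = len(vectors[0])
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    radius0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                     # learning rate decays linearly
        radius = max(radius0 * (1.0 - frac), 0.5)   # neighbourhood shrinks over time
        for v in vectors:
            bmu = best_matching_unit(weights, v)
            for node, w in weights.items():
                # Gaussian neighbourhood: nodes near the BMU move more.
                d = euclidean(node, bmu)
                influence = math.exp(-(d * d) / (2.0 * radius * radius))
                for i in range(dim):
                    w[i] += lr * influence * (v[i] - w[i])
    return weights


def nearest_tagged_documents(weights, tagged, new_vector, k=2):
    """Names of the k pre-tagged documents mapped closest to the new document.

    `tagged` maps a document name to its feature vector (an assumed format).
    """
    target = best_matching_unit(weights, new_vector)
    positions = {name: best_matching_unit(weights, vec)
                 for name, vec in tagged.items()}
    return sorted(positions, key=lambda n: euclidean(positions[n], target))[:k]


# Hypothetical toy corpus: two loose groups of three-term documents.
tagged_docs = {
    "report_a": [1.0, 0.0, 0.0],
    "report_b": [0.9, 0.1, 0.0],
    "letter_a": [0.0, 0.0, 1.0],
    "letter_b": [0.0, 0.1, 0.9],
}
som = train_som(list(tagged_docs.values()), rows=3, cols=3)
neighbours = nearest_tagged_documents(som, tagged_docs, [1.0, 0.0, 0.0], k=2)
print(neighbours)
```

In a pipeline of the kind the abstract describes, the documents named in `neighbours` would then serve as training examples for the rule learner, so that the markup rules are induced only from documents whose content resembles the new one.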