TextLec: a novel method of segmentation by topic using lower windows and lexical cohesion

  • Authors:
  • Laritza Hernández Rojas; José E. Medina Pagola

  • Affiliations:
  • Advanced Technologies Application Centre, C. Habana, Cuba; Advanced Technologies Application Centre, C. Habana, Cuba

  • Venue:
  • CIARP'07: Proceedings of the 12th Iberoamerican Congress on Pattern Recognition: Progress in Pattern Recognition, Image Analysis and Applications
  • Year:
  • 2007

Abstract

The automatic detection of appropriate subtopic boundaries in a document is a difficult but very useful task in text processing. Several methods have tried to solve this problem; some have had favorable results, but they have also shown drawbacks, and several of them depend on the application domain. In this work we propose a new algorithm that uses a window below the paragraphs to measure lexical cohesion and detect subtopics in scientific papers. We compare our method against two other algorithms that also use lexical cohesion. In this comparison, our method performs well and outperforms both of them.
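
The abstract does not give the details of TextLec, so the following is only a minimal illustrative sketch of the general idea it names: scoring lexical cohesion between a paragraph and a window of the paragraphs below it, and placing a subtopic boundary where cohesion is low. The cosine measure, the tokenizer, the function names, and the `window` and `threshold` parameters are all assumptions made for illustration and are not taken from the paper.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Bag-of-words term frequencies (lowercased alphabetic tokens). Assumed tokenizer."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters (assumed cohesion measure)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cohesion_with_lower_window(paragraphs, window=2):
    """Score each paragraph against the merged terms of the next `window` paragraphs."""
    vectors = [term_vector(p) for p in paragraphs]
    scores = []
    for i in range(len(vectors) - 1):
        below = Counter()
        for v in vectors[i + 1 : i + 1 + window]:
            below.update(v)
        scores.append(cosine(vectors[i], below))
    return scores


def boundaries(paragraphs, window=2, threshold=0.1):
    """Return indices i where a boundary is placed after paragraph i.

    The fixed threshold is a hypothetical decision rule, chosen for simplicity.
    """
    scores = cohesion_with_lower_window(paragraphs, window)
    return [i for i, s in enumerate(scores) if s < threshold]


paragraphs = [
    "cats purr and cats sleep",
    "cats chase mice",
    "stocks fell as markets closed",
    "markets and stocks rallied",
]
print(boundaries(paragraphs, window=1))
```

In this toy input the second paragraph shares no terms with the one below it, so the sketch places its only boundary there. A real system would at least remove stop words and normalize word forms before measuring cohesion.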