Beyond the bag of words: a text representation for sentence selection

  • Authors:
  • Maria Fernanda Caropreso;Stan Matwin

  • Affiliations:
  • School of Information Technology and Engineering, University of Ottawa, Ottawa, Ontario;School of Information Technology and Engineering, University of Ottawa, Ottawa, Ontario

  • Venue:
  • AI'06 Proceedings of the 19th international conference on Advances in Artificial Intelligence: Canadian Society for Computational Studies of Intelligence
  • Year:
  • 2006

Abstract

Sentence selection shares some, but not all, of the characteristics of Automatic Text Categorization. Therefore, some, but not all, of the same techniques should be used. In this paper we study a syntactically and semantically enriched text representation for the sentence selection task in a genomics corpus. We show that using technical dictionaries and syntactic relations is beneficial for our problem when using state-of-the-art machine learning algorithms. Furthermore, the syntactic relations can be used by a first-order rule learner to obtain even better performance.