An exploratory study on the impact of temporal features on the classification and clustering of future-related web documents

  • Authors:
  • Ricardo Campos;Gaël Dias;Alípio Jorge

  • Affiliations:
  • HULTIG, University of Beira Interior, Covilhã and Polytechnic Institute of Tomar, Tomar and LIAAD - INESC Porto LA, University of Porto, Porto, Portugal;HULTIG, University of Beira Interior, Covilhã, Portugal and DLU/GREYC, University of Caen Basse-Normandie, Caen, France;LIAAD - INESC Porto LA, University of Porto, Porto, Portugal

  • Venue:
  • EPIA'11 Proceedings of the 15th Portuguese conference on Progress in artificial intelligence
  • Year:
  • 2011

Abstract

In the last few years, a huge amount of temporal written information has become widely available on the Internet with the advent of forums, blogs and social networks. This has given rise to a challenging new problem called future retrieval, which consists of extracting future temporal information that is known in advance from web sources in order to answer queries that combine text of a future temporal nature. This paper aims to determine whether web snippets can be used to form an intelligent web that can detect expected future events when their dates are already known. Moreover, the objective is to identify the nature of future texts and to understand how temporal features affect the classification and clustering of the different types of future-related texts: informative texts, scheduled texts and rumor texts. We have conducted a set of comprehensive experiments, and the results show that web documents are a valuable source of future data that can be particularly useful in identifying and understanding the future temporal nature of a given implicit temporal query.
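
As a rough illustration of what a temporal feature for a web snippet might look like, the sketch below extracts four-digit years from a snippet and keeps those later than a reference year. It is not the authors' method: the abstract does not specify how dates are detected or which features are used, so the function names (`future_years`, `temporal_features`), the year pattern, and the three features are assumptions made purely for illustration.

```python
import re

# Assumption: a "date" is a standalone four-digit year between 1900 and 2099.
# Real temporal taggers handle far richer expressions than this.
YEAR_PATTERN = re.compile(r"\b(19\d{2}|20\d{2})\b")


def future_years(snippet, reference_year):
    """Return the sorted distinct years in `snippet` later than `reference_year`."""
    years = {int(match) for match in YEAR_PATTERN.findall(snippet)}
    return sorted(year for year in years if year > reference_year)


def temporal_features(snippet, reference_year):
    """Build a small, hypothetical feature dictionary for one snippet."""
    found = future_years(snippet, reference_year)
    return {
        "has_future_date": bool(found),
        "num_future_dates": len(found),
        # Distance in years between the reference year and the latest future year.
        "max_horizon": (found[-1] - reference_year) if found else 0,
    }
```

Features of this kind could be appended to a bag-of-words representation before training a classifier or running a clustering algorithm over the three text types; whether they help is the question the paper studies experimentally.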