An exploratory study on the impact of temporal features on the classification and clustering of future-related web documents

Authors:
Ricardo Campos;Gaël Dias;Alípio Jorge
Affiliations:
HULTIG, University of Beira Interior, Covilhã and Polytechnic Institute of Tomar, Tomar and LIAAD - INESC Porto LA, University of Porto, Porto, Portugal;HULTIG, University of Beira Interior, Covilhã, Portugal and DLU/GREYC, University of Caen Basse-Normandie, Caen, France;LIAAD - INESC Porto LA, University of Porto, Porto, Portugal
Venue:
EPIA'11 Proceedings of the 15th Portuguese conference on Progress in artificial intelligence
Year:
2011

Citing 5
Cited 1

ARSA: a sentiment-aware model for predicting sales performance using blogs

SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Predicting the News of Tomorrow Using Patterns in Web Search Queries

WI-IAT '08 Proceedings of the 2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology - Volume 01
Supporting analysis of future-related information in news archives and the web

Proceedings of the 9th ACM/IEEE-CS joint conference on Digital libraries
Analyzing collective view of future, time-referenced events on the web

Proceedings of the 19th international conference on World wide web
A language modeling approach for temporal information needs

ECIR'2010 Proceedings of the 32nd European conference on Advances in Information Retrieval

A survey of temporal web search experience

Proceedings of the 22nd international conference on World Wide Web companion

Abstract

In the last few years, a huge amount of temporal written information has become widely available on the Internet with the advent of forums, blogs and social networks. This has given rise to a challenging new problem called future retrieval, which consists of extracting future temporal information that is known in advance from web sources in order to answer queries that combine text of a future temporal nature. This paper aims to determine whether web snippets can be used to form an intelligent web that can detect expected future events when their dates are already known. Moreover, the objective is to identify the nature of future texts and to understand how temporal features affect the classification and clustering of the different types of future-related texts: informative texts, scheduled texts and rumor texts. We have conducted a set of comprehensive experiments, and the results show that web documents are a valuable source of future data that can be particularly useful in identifying and understanding the future temporal nature of a given implicit temporal query.