Exploring temporal evidence in web information retrieval

Authors:
Sérgio Nunes
Affiliations:
Faculdade de Engenharia, Universidade do Porto, Porto, Portugal
Venue:
FDIA'07 Proceedings of the 1st BCS IRSG conference on Future Directions in Information Access
Year:
2007

Abstract

Web Information Retrieval (WebIR) is the application of Information Retrieval concepts to the World Wide Web. The most successful approaches in this field have modeled the web's structure as a directed graph and explored this model in different ways. Within this line of research, HITS and PageRank are two of the best-known paradigms for evaluating the importance of web documents. Most of this research has its origins in citation analysis, but although time is an important dimension in the citation analysis literature, it has not been explored in depth within WebIR. Recent studies show that the web is a highly dynamic environment, with significant changes occurring weekly. The blogosphere is a good example of this very active behavior. In this work, temporal web evidence is identified and categorized into two classes, one based on features extracted from individual documents and the other based on features extracted from the whole web. A broad survey of previous work exploring temporal evidence is also presented. Finally, ideas for exploring temporal web evidence in typical web tasks are briefly discussed. The lack of suitable corpora containing temporal evidence has been a deterrent to research in this field. The recent availability of public datasets containing temporal information has raised public awareness of this topic.