Proceedings of the 9th annual ACM international workshop on Web information and data management
Time has been used successfully as a feature in web information retrieval tasks. In this context, estimating a document's inception date or last update date is a necessary task. Classic approaches rely on HTTP header fields to estimate a document's last update time. The main problem with this approach is that it applies to only a small fraction of web documents. In this work, we evaluate an alternative strategy based on a document's neighborhood. Using a random sample of 10,000 URLs from the Yahoo! Directory, we study each document's links and media assets to determine its age. When we consider documents in isolation, we are able to date 52% of them. When we include each document's neighborhood, we are able to estimate the date of more than 86% of the same sample. We also find that estimates differ significantly according to the type of neighbors used. The most reliable estimates are based on the document's media assets, while the least reliable are based on incoming links. These results are experimentally evaluated with a real-world application using different datasets.
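The general idea of falling back on a document's neighborhood can be illustrated with a minimal sketch. It is not the procedure evaluated in this work: the function names `parse_last_modified` and `estimate_date` are hypothetical, the headers are assumed to be already fetched and passed in as plain dictionaries keyed by `Last-Modified`, and taking the median of the neighbors' timestamps is an assumption chosen only for illustration, since the abstract does not state how neighbor dates are combined.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from statistics import median
from typing import Optional


def parse_last_modified(headers: dict) -> Optional[datetime]:
    """Return the Last-Modified header as an aware datetime, or None if absent or invalid."""
    value = headers.get("Last-Modified")
    if not value:
        return None
    try:
        parsed = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        # Malformed date strings are treated as missing.
        return None
    if parsed.tzinfo is None:
        # Treat timestamps without a zone as UTC (assumption).
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def estimate_date(doc_headers: dict, neighbor_headers: list) -> Optional[datetime]:
    """Date a document from its own header, else from its neighbors' headers."""
    own = parse_last_modified(doc_headers)
    if own is not None:
        return own
    dates = [d for d in map(parse_last_modified, neighbor_headers) if d is not None]
    if not dates:
        return None
    # Assumption: aggregate neighbor dates with the median of their timestamps.
    middle = median(d.timestamp() for d in dates)
    return datetime.fromtimestamp(middle, tz=timezone.utc)


# Hypothetical neighbors (for example, media assets or linked pages).
neighbors = [
    {"Last-Modified": "Sat, 01 Jan 2005 00:00:00 GMT"},
    {"Last-Modified": "Sun, 01 Jan 2006 00:00:00 GMT"},
    {"Last-Modified": "Mon, 01 Jan 2007 00:00:00 GMT"},
    {},  # a neighbor that exposes no usable header
]
estimate = estimate_date({}, neighbors)
```

Since the reliability of an estimate depends on the type of neighbor, a fuller version would probably keep media assets, outgoing links, and incoming links in separate lists and prefer or weight them accordingly; that refinement is left out here.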