Synchronizing a database to improve freshness
SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Proceedings of the 9th international World Wide Web conference on Computer networks : the international journal of computer and telecommunications networking
Best-effort cache synchronization with source cooperation
Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Estimating frequency of change
ACM Transactions on Internet Technology (TOIT)
Effective page refresh policies for Web crawlers
ACM Transactions on Database Systems (TODS)
Information diffusion through blogspace
Proceedings of the 13th international conference on World Wide Web
WWW '05 Proceedings of the 14th international conference on World Wide Web
Efficient Monitoring Algorithm for Fast News Alerts
IEEE Transactions on Knowledge and Data Engineering
Recrawl scheduling based on information longevity
Proceedings of the 17th international conference on World Wide Web
Approximate Information Filtering in Peer-to-Peer Networks
WISE '08 Proceedings of the 9th international conference on Web Information Systems Engineering
Utilizing RSS Feeds for Crawling the Web
ICIW '09 Proceedings of the 2009 Fourth International Conference on Internet and Web Applications and Services
Best-effort refresh strategies for content-based RSS feed aggregation
WISE'10 Proceedings of the 11th international conference on Web information systems engineering
Characterizing web syndication behavior and content
WISE'11 Proceedings of the 12th international conference on Web information system engineering
Modern Web 2.0 applications have transformed the Internet into an interactive, dynamic, and lively information space. Personal weblogs, commercial web sites, news portals, and social media applications generate highly dynamic information streams that must be propagated to millions of users. This article focuses on the problem of estimating the publication frequency of highly dynamic web resources. We illustrate the importance of efficient online estimation techniques for improving the refresh strategies of RSS feed aggregators such as Google Reader [8], Datasift [7], or Roses [11]. We study the temporal publication characteristics of a large collection of real-world RSS feeds, and we define and evaluate several online estimation methods in combination with different refresh strategies. We show the benefit of using periodic source publication patterns for change estimation, and we highlight the challenges imposed by the application context.