Recently, there has been a dramatic increase in the use of XML data to deliver information over the Web. Personal Weblogs, news Web sites, and discussion forums now publish RSS feeds from which their subscribers retrieve new postings. As the popularity of personal Weblogs and RSS feeds grows rapidly, RSS aggregation services and blog search engines have appeared, which try to provide a central access point for easier access to, and discovery of, new content from a large number of diverse RSS sources. In this paper, we study how RSS aggregation services should monitor the data sources so that they retrieve new content quickly using minimal resources and provide their subscribers with fast news alerts. We believe that the change characteristics of RSS sources and the general user access behavior pose distinct requirements that make this task significantly different from the traditional index refresh problem for Web search engines. Our studies on a collection of 10,000 RSS feeds reveal some general characteristics of RSS feeds and show that, with proper resource allocation and scheduling, an RSS aggregator can provide news alerts significantly faster than the best existing approach.
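To make the objective concrete, the following is a minimal sketch of how the timeliness of an aggregator could be measured and why scheduling matters. It is an illustration only, not the algorithm of the paper: the function name `average_delay`, the definition of delay as the wait from a post's publication to the first poll at or after it, and the sample posting and polling times are all assumptions introduced here.

```python
from bisect import bisect_left


def average_delay(post_times, poll_times):
    """Mean wait between each post's publication and the first poll at or after it.

    Posts published after the last poll are ignored, since they are never
    retrieved within the observed window. (Assumed definition of delay.)
    """
    polls = sorted(poll_times)
    delays = []
    for t in post_times:
        i = bisect_left(polls, t)  # index of the first poll time >= t
        if i < len(polls):
            delays.append(polls[i] - t)
    return sum(delays) / len(delays) if delays else 0.0


# Hypothetical feed whose posts cluster in the morning (times in hours).
posts = [8.0, 8.5, 9.0, 9.5, 15.0]

# Two schedules with the same budget of four polls per day.
uniform_polls = [6.0, 12.0, 18.0, 24.0]  # evenly spaced
skewed_polls = [9.0, 10.0, 16.0, 24.0]   # shifted toward the posting times

print(average_delay(posts, uniform_polls))
print(average_delay(posts, skewed_polls))
```

With the same number of polls, the schedule that follows the feed's posting pattern yields a much lower average delay in this toy example, which is the kind of gain that resource allocation and scheduling aim for.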