Search engines are an essential component of the web, but their crawling agents can impose a significant burden on heavily loaded web servers. Unfortunately, blocking or deferring web crawler requests is not a viable solution because of the economic consequences. We conduct a quantitative measurement study of the impact and cost of web crawling agents, seeking optimization opportunities for this class of requests. Based on our measurements, we present a practical caching approach for mitigating search engine overload and implement the resulting two-level cache scheme on a very busy web server. Our experimental results show that the proposed caching framework effectively reduces the impact of search engine overload on service quality.
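The abstract does not spell out the two-level cache design, so the following is only a minimal illustrative sketch of the general idea: crawler requests are identified (here by a naive User-Agent check, a stand-in for the probabilistic detection methods in the literature) and served from a two-level cache, with a small first-level LRU backed by a larger second level that stands in for on-disk storage. All names (`TwoLevelCache`, `handle_request`, the capacity values, and the User-Agent hint list) are hypothetical, not taken from the paper.

```python
from collections import OrderedDict

# Illustrative User-Agent substrings; real detection is more involved.
CRAWLER_UA_HINTS = ("Googlebot", "bingbot", "Baiduspider")


def is_crawler(user_agent: str) -> bool:
    """Naive crawler check; stands in for probabilistic detection."""
    ua = user_agent.lower()
    return any(hint.lower() in ua for hint in CRAWLER_UA_HINTS)


class TwoLevelCache:
    """L1: small in-memory LRU; L2: larger store (a stand-in for disk)."""

    def __init__(self, l1_capacity: int = 2, l2_capacity: int = 8):
        self.l1: OrderedDict[str, str] = OrderedDict()
        self.l2: OrderedDict[str, str] = OrderedDict()
        self.l1_capacity = l1_capacity
        self.l2_capacity = l2_capacity

    def get(self, url: str):
        if url in self.l1:
            self.l1.move_to_end(url)          # refresh LRU position
            return self.l1[url]
        if url in self.l2:
            value = self.l2.pop(url)
            self.put(url, value)              # promote L2 hit into L1
            return value
        return None

    def put(self, url: str, page: str) -> None:
        self.l1[url] = page
        self.l1.move_to_end(url)
        if len(self.l1) > self.l1_capacity:
            # Demote the least-recently-used L1 entry to L2.
            old_url, old_page = self.l1.popitem(last=False)
            self.l2[old_url] = old_page
            if len(self.l2) > self.l2_capacity:
                self.l2.popitem(last=False)   # evict oldest L2 entry


def handle_request(cache: TwoLevelCache, url: str, user_agent: str, render_page):
    """Serve crawler requests from cache when possible; render otherwise."""
    if not is_crawler(user_agent):
        return render_page(url)               # interactive users get fresh content
    page = cache.get(url)
    if page is None:
        page = render_page(url)               # miss: render once, then cache
        cache.put(url, page)
    return page
```

In this sketch only crawler traffic takes the cached path, so human visitors still see freshly generated pages while repeated crawler fetches of the same URL avoid redundant server-side work.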