Web traffic modeling at finer time scales and performance implications

Authors:
Cathy H. Xia;Zhen Liu;Mark S. Squillante;Li Zhang;Naceur Malouch
Affiliations:
IBM Thomas J. Watson Research Center, Yorktown Heights, NY 10598, USA;IBM Thomas J. Watson Research Center, Yorktown Heights, NY 10598, USA;IBM Thomas J. Watson Research Center, Yorktown Heights, NY 10598, USA;IBM Thomas J. Watson Research Center, Yorktown Heights, NY 10598, USA;Laboratoire LIP6-CNRS, Université Pierre et Marie Curie, 8 rue du capitaine Scott, 75015 Paris, France
Venue:
Performance Evaluation - Long range dependence and heavy tail distributions
Year:
2005

Abstract

The performance of Web sites continues to be an important research topic. Such studies are invariably based on the access logs from the servers comprising the Web site. A problem with existing access logs is the coarse granularity of their timestamps, e.g., request arrival times. In this study we demonstrate and quantify the significant differences in performance obtained under diverse assumptions about the arrival process of user requests derived from the access logs: the corresponding user response times can differ by more than an order of magnitude. This motivates the need for a general methodology to construct accurate representations of the actual arrival process of user requests from existing coarse-grained access-log data. Our analysis of the access logs from representative commercial Web sites shows that the arrival process exhibits self-similar behavior. We propose a drill-down methodology for constructing the arrival process at finer time scales based on the self-similar properties of the arrival process observed at the coarse logging time scales. The advantage of our approach is that it maintains consistency between the properties of the arrival processes at the coarser and finer time scales. In addition, our analysis of the request size distribution from commercial Web sites shows that the distribution is subexponential but not heavy-tailed (power-law). Through simulations, we investigate the impact of these different traffic models on user response times.