Analysis of empirical data has shown that the standard Markovian model of Web navigation is a poor predictor of actual Web traffic. Using this data, we characterize several properties of Web traffic that cannot be reproduced by Markovian models but can be explained by an agent-based model that adds several realistic browsing behaviors. First, agents maintain bookmark lists used as teleportation targets. Second, agents can backtrack along visited links, a branching mechanism that reproduces behaviors such as the back button and tabbed browsing. Finally, agents are sustained by visiting pages of topical interest, with adjacent pages tending to be related; this modulates the creation of new sessions, recreating the observed heterogeneity of session lengths. The resulting model reproduces individual behaviors from empirical data, reconciling the narrowly focused browsing patterns of individual users with the extreme heterogeneity of aggregate traffic measurements, and leading the way to more sophisticated, realistic, and effective ranking and crawling algorithms.
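The three mechanisms in the abstract (bookmarks as teleportation targets, backtracking along visited links, and interest-driven session termination) can be sketched as a toy agent. This is a minimal illustration, not the authors' implementation: all class names, parameter names, and probability values below are assumptions chosen for clarity.

```python
import random


class BrowsingAgent:
    """Toy browsing agent with the three mechanisms described above.

    Hypothetical sketch: parameter names and values (p_back, p_teleport,
    the exponential interest cost) are illustrative assumptions, not
    taken from the paper.
    """

    def __init__(self, pages, p_back=0.2, p_teleport=0.15, seed=None):
        self.rng = random.Random(seed)
        self.pages = pages                      # page -> list of outgoing links
        self.bookmarks = [next(iter(pages))]    # bookmark list: teleportation targets
        self.p_back = p_back
        self.p_teleport = p_teleport

    def session(self, interest=3.0):
        """Browse until topical interest is exhausted; return the visit trail."""
        current = self.rng.choice(self.bookmarks)
        stack, trail = [], [current]
        while interest > 0:
            r = self.rng.random()
            if r < self.p_back and stack:
                current = stack.pop()                       # retreat: back button / tab
            elif r < self.p_back + self.p_teleport:
                current = self.rng.choice(self.bookmarks)   # teleport to a bookmark
                stack.clear()
            else:
                stack.append(current)
                current = self.rng.choice(self.pages[current])  # follow a link
            trail.append(current)
            interest -= self.rng.expovariate(1.0)  # interest spent per page visited
        self.bookmarks.append(current)  # remember where this session ended
        return trail
```

A usage sketch on a tiny three-page graph: `agent = BrowsingAgent({"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"]}, seed=1)` and then `agent.session()` yields one visit trail. Because the per-page interest cost is a random draw, session lengths vary across runs, giving the heterogeneous session lengths the abstract describes.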