Access patterns for robots and humans in web archives
Proceedings of the 13th ACM/IEEE-CS joint conference on Digital libraries
Hi-index | 0.00 |
Following the widely use of search engines, the impact Web crawlers have on the Web sites should not be ignored. After analyzing the navigational patterns of Web crawlers from Web logs, a new algorithm based on Web page member list is proposed. The algorithm constructs one member list for every Web page and one show table for every visitor. The experiment shows that the new algorithm can detect the unknown crawlers and unfriendly crawlers who do not obey the Standard for Robot Exclusion.