The Automatic Evaluation of Website Metrics and State
International Journal of Web-Based Learning and Teaching Technologies
Services-oriented URL filtering and verification
International Journal of High Performance Systems Architecture
Hi-index | 0.01 |
For large-scale Web information collection, the URL filter module plays important roles in a Web crawler which is a central component of a search engine. The performance of an URL filter module influents the efficiency of the entire collection system directly. This paper introduces one URL filter algorithm based on caching and its implementation. The performances of stability and paralleling of the algorithm are verified by the experiments for Websites which handle a large number of web pages. Experiment results show the algorithm proposed in this paper can achieve satisfactory performances through reasonable adjustments of its some parameters and it is suitable for the process of the URL filter of a Website which has a number of page navigator links and index pages especially.