The literature uses a number of similarity metrics to measure the degree of change in web pages. In this paper, we define criteria for web page changes against which the effectiveness of these metrics can be evaluated. Using both real and synthesized web pages, we analyze five existing metrics (byte-wise comparison, TF∙IDF cosine distance, word distance, edit distance, and shingling) under the proposed criteria. The results of the analysis can help users select an appropriate metric for a particular web application.
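To make the contrast between the metrics concrete, here is a minimal Python sketch of two of the five: byte-wise comparison and shingling. It is an illustration under stated assumptions, not the paper's implementation. The function names, the whitespace tokenizer, the shingle width `w=4`, and the use of Jaccard resemblance over word shingles are all choices made for this example; the paper's exact definitions may differ.

```python
def bytewise_changed(old: str, new: str) -> bool:
    """Byte-wise comparison: reports only whether the pages differ at all."""
    return old.encode("utf-8") != new.encode("utf-8")


def shingles(text: str, w: int = 4) -> set:
    """Return the set of contiguous w-word sequences (shingles) in text.

    Assumption: words are split on whitespace. A text shorter than w words
    yields a single shingle holding all of its words.
    """
    words = text.split()
    if not words:
        return set()
    if len(words) < w:
        return {tuple(words)}
    return {tuple(words[i:i + w]) for i in range(len(words) - w + 1)}


def shingle_resemblance(old: str, new: str, w: int = 4) -> float:
    """Jaccard resemblance of the two shingle sets, in [0, 1].

    1.0 means identical shingle sets; 0.0 means no shingle in common.
    """
    a, b = shingles(old, w), shingles(new, w)
    if not a and not b:
        return 1.0  # two empty pages are treated as unchanged
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    old_page = "the quick brown fox jumps over the lazy dog"
    new_page = "the quick brown fox jumps over the lazy cat"
    print(bytewise_changed(old_page, new_page))     # True
    print(shingle_resemblance(old_page, new_page))  # 5/7, about 0.714
```

In this example a one-word edit makes the byte-wise metric report a change with no indication of its size, while the shingle resemblance drops only to 5/7. Differences in behaviour of this kind are what the proposed criteria are meant to expose.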