The Web is a dynamic, ever-changing collection of information. This paper explores changes in Web content by analyzing a crawl of 55,000 Web pages, selected to represent different user visitation patterns. Although change over long intervals has been explored on random (and potentially unvisited) samples of Web pages, little is known about the nature of finer-grained changes to pages that are actively consumed by users, such as those in our sample. We describe algorithms, analyses, and models for characterizing changes in Web content, focusing on both time (by using hourly and sub-hourly crawls) and structure (by looking at page-, DOM-, and term-level changes). Change rates are higher in our behavior-based sample than those found in previous work on randomly sampled pages, with a large portion of pages changing more than hourly. Detailed content and structure analyses identify stable and dynamic content within each page. The understanding of Web change we develop in this paper has implications for tools designed to help people interact with dynamic Web content, such as search engines, advertising, and Web browsers.
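As a rough illustration of what a term-level change measurement between successive crawls might look like, the sketch below compares the term sets of page snapshots. The abstract does not specify the measures used, so the Dice coefficient, the tokenizer, and the function names (`terms`, `dice_similarity`, `change_rate`) are all assumptions chosen for illustration, not the paper's method.

```python
from __future__ import annotations

import re


def terms(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric tokens (assumed tokenizer)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def dice_similarity(old: str, new: str) -> float:
    """Dice coefficient between the term sets of two page snapshots.

    1.0 means identical term sets; 0.0 means no terms in common.
    """
    a, b = terms(old), terms(new)
    if not a and not b:
        return 1.0  # two empty pages are treated as unchanged
    return 2 * len(a & b) / (len(a) + len(b))


def change_rate(snapshots: list[str]) -> float:
    """Fraction of consecutive snapshot pairs whose term sets differ."""
    if len(snapshots) < 2:
        return 0.0
    changed = sum(
        1 for prev, cur in zip(snapshots, snapshots[1:]) if terms(prev) != terms(cur)
    )
    return changed / (len(snapshots) - 1)
```

For example, `change_rate` applied to a list of hourly snapshots of one page gives the share of hours in which its vocabulary changed, while `dice_similarity` indicates how large each change was. A set-based measure ignores term order and frequency, so it would miss rearrangements that a DOM-level comparison could detect.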