The rapid growth of the web has been noted and tracked extensively. Recent studies, however, have documented the dual phenomenon: web pages have short half-lives, and thus the web exhibits rapid death as well. Consequently, page creators face an increasingly burdensome task of keeping links up to date, and many are falling behind. Beyond individual pages, collections of pages or even entire neighborhoods of the web exhibit significant decay, rendering them less effective as information resources. Such neighborhoods are identified only by frustrated searchers seeking a way out of them and back to more up-to-date sections of the web; measuring the decay of a page purely on the basis of the dead links on that page is too naive to reflect this frustration. In this paper we formalize a strong notion of a decay measure and present algorithms for computing it efficiently. We explore this measure by presenting a number of validations, and use it to identify interesting artifacts on today's web. We then describe a number of applications of such a measure to search engines, web page maintainers, ontologists, and individual users.
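
To make the contrast between a per-page dead-link count and a neighborhood-level notion of decay concrete, the following is a minimal sketch in Python. It is not the measure formalized in the paper, which the abstract does not spell out. It assumes a toy link graph stored as a dictionary, a known set of dead pages, and a hypothetical `damping` parameter that discounts decay found further away. The function names `dead_link_fraction` and `neighborhood_decay` are illustrative only.

```python
def dead_link_fraction(page, links, dead):
    """Naive measure: the share of a page's own outlinks that are dead."""
    outs = links.get(page, [])
    if not outs:
        return 0.0
    return sum(1 for q in outs if q in dead) / len(outs)


def neighborhood_decay(links, dead, damping=0.5, iterations=50):
    """Illustrative propagated score (an assumption, not the paper's definition).

    A dead page scores 1. A live page scores `damping` times the average
    score of the pages it links to, so decay a few clicks away still counts,
    but less than decay directly on the page.
    """
    pages = set(links) | {q for outs in links.values() for q in outs} | set(dead)
    score = {p: (1.0 if p in dead else 0.0) for p in pages}
    for _ in range(iterations):
        updated = {}
        for p in pages:
            outs = links.get(p, [])
            if p in dead:
                updated[p] = 1.0
            elif outs:
                updated[p] = damping * sum(score[q] for q in outs) / len(outs)
            else:
                updated[p] = 0.0
        score = updated
    return score


# Toy graph: "a" links only to live pages, but both of them lead to dead "d".
links = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
dead = {"d"}

naive_a = dead_link_fraction("a", links, dead)  # 0.0: no dead links on "a" itself
scores = neighborhood_decay(links, dead)        # scores["a"] is 0.25
```

In this toy graph the naive count rates page `a` as perfectly healthy, while the propagated score flags it because every path out of `a` reaches a dead page within two clicks. That is the kind of stale neighborhood the abstract describes.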