Potential benefits of delta encoding and data compression for HTTP
SIGCOMM '97 Proceedings of the ACM SIGCOMM '97 conference on Applications, technologies, architectures, and protocols for computer communication
Size-estimation framework with applications to transitive closure and reachability
Journal of Computer and System Sciences
Syntactic clustering of the Web
Selected papers from the sixth international conference on World Wide Web
The anatomy of a large-scale hypertextual Web search engine
WWW7 Proceedings of the seventh international conference on World Wide Web 7
Authoritative sources in a hyperlinked environment
Journal of the ACM (JACM)
Proceedings of the 9th international World Wide Web conference on Computer networks : the international journal of computer and telecommunications netowrking
On network-aware clustering of Web clients
Proceedings of the conference on Applications, Technologies, Architectures, and Protocols for Computer Communication
An adaptive model for optimizing performance of an incremental web crawler
Proceedings of the 10th international conference on World Wide Web
The Evolution of the Web and Implications for an Incremental Crawler
VLDB '00 Proceedings of the 26th International Conference on Very Large Data Bases
On the bursty evolution of blogspace
WWW '03 Proceedings of the 12th international conference on World Wide Web
A large-scale study of the evolution of web pages
Software—Practice & Experience - Special issue: Web technologies
PRO-COW: Protocol compliance on the web-a longitudinal study
USITS'01 Proceedings of the 3rd conference on USENIX Symposium on Internet Technologies and Systems - Volume 3
Rate of change and other metrics: a live study of the world wide web
USITS'97 Proceedings of the USENIX Symposium on Internet Technologies and Systems on USENIX Symposium on Internet Technologies and Systems
BlogRank: ranking weblogs based on connectivity and similarity features
AAA-IDEA '06 Proceedings of the 2nd international workshop on Advanced architectures and algorithms for internet delivery and applications
Proceedings of the 10th Annual International Conference on Digital Government Research: Social Networks: Making Connections between Citizens, Data and Government
Hi-index | 0.00 |
The increasingly prominent new subset of Web pages, called 'blogs' differs from traditional Web pages both in characteristics and potential to applications. We explore three aspects of the blogistan: its overall scope and size, identification of emerging hot topics of discussion and link patterns, and implications both to blogs and applications such as search. Beyond blogs, we develop a general methodology of mining evolving networks and connections. The first part of our study is longitudinal-based on a five-week continuous fetch of a seed collection of nearly 10,000 blog URLs. The second part is based on a successive crawl of pages suspected to be blogs leading to a larger collection of several million URLs. The collection is examined for a variety of properties. We characterize blogs and study different facets of the link structure in blogs and its evolution over time, attributes of servers and domains that host many of the blogs including their IP addresses, and how blogs behave with respect to various HTTP/1.1 protocol issues. Inferences from our in-depth exploration are relevant to applications ranging from mining to hosting of blogs and other issues of relevance to the measurement community.