During the last few years, several studies on the characterization of the public Web space of various national domains have been published. The pages of a country are an interesting set for studying the characteristics of the Web because they are at once diverse (written by several authors) and rather similar (sharing a common geographical, historical, and cultural context). This article discusses the methodologies used for presenting the results of Web characterization studies, including the granularity at which different aspects are presented and a separation of concerns between contents, links, and technologies. Based on this, we present a side-by-side comparison of the results of 12 Web characterization studies, comprising over 120 million pages from 24 countries. The comparison reveals similarities and differences between the collections and sheds light on how certain results of a single Web characterization study on a sample may be valid in the context of the full Web.