Studying web graphs is often difficult due to their large size. Recently, several proposals have been published describing techniques that allow a web graph to be stored in memory in limited space by exploiting the inner redundancies of the web. The WebGraph framework is a suite of codes, algorithms and tools that aims at making it easy to manipulate large web graphs. This paper presents the compression techniques used in WebGraph, which are centred around referentiation and intervalisation (which in turn are dual to each other). WebGraph can compress the WebBase graph (118 Mnodes, 1 Glinks) in as little as 3.08 bits per link, and its transposed version in as little as 2.89 bits per link.
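The two ideas named in the abstract can be illustrated with a small sketch. It is an illustration under stated assumptions, not WebGraph's actual on-disk format or API: the function names (`split_intervals`, `encode`, `decode`), the `min_len` threshold, and the sample adjacency lists are all hypothetical. Referentiation is modelled here as describing a node's successor list by a copy mask over another node's list, and intervalisation as replacing runs of consecutive node identifiers by (start, length) pairs.

```python
def split_intervals(sorted_ids, min_len=3):
    """Intervalisation (sketch): split a sorted list of distinct ints into
    (start, length) runs of consecutive values and leftover residuals."""
    intervals, residuals = [], []
    i, n = 0, len(sorted_ids)
    while i < n:
        j = i
        # Extend j while the next id continues the consecutive run.
        while j + 1 < n and sorted_ids[j + 1] == sorted_ids[j] + 1:
            j += 1
        run_len = j - i + 1
        if run_len >= min_len:
            intervals.append((sorted_ids[i], run_len))
        else:
            residuals.extend(sorted_ids[i:j + 1])
        i = j + 1
    return intervals, residuals


def encode(successors, reference, min_len=3):
    """Referentiation (sketch): describe `successors` relative to `reference`.
    Returns a copy mask over `reference`, plus the intervals and residuals
    of the successors that do not appear in `reference`."""
    succ_set = set(successors)
    copy_mask = [1 if r in succ_set else 0 for r in reference]
    ref_set = set(reference)
    extras = [s for s in successors if s not in ref_set]
    intervals, residuals = split_intervals(extras, min_len)
    return copy_mask, intervals, residuals


def decode(encoded, reference):
    """Rebuild the sorted successor list from the encoding and the reference."""
    copy_mask, intervals, residuals = encoded
    out = [r for r, bit in zip(reference, copy_mask) if bit]
    for start, length in intervals:
        out.extend(range(start, start + length))
    out.extend(residuals)
    return sorted(out)


# Hypothetical adjacency lists of two nearby pages (made-up identifiers).
reference = [13, 15, 16, 17, 18, 19, 23, 24, 203]
successors = [13, 15, 16, 17, 22, 23, 24, 315, 316, 317, 3041]

encoded = encode(successors, reference)
assert decode(encoded, reference) == successors
```

In this example six of the eleven successors are copied from the reference list, three more collapse into a single interval, and only two residuals remain to be stored explicitly. A real encoder would additionally write the mask, intervals, and residual gaps with compact variable-length codes, which this sketch omits.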