In this work we propose a model that represents the web as a directed hypergraph (instead of a graph), where links connect pairs of disjoint sets of pages. The web hypergraph is derived from the web graph by dividing the set of pages into non-overlapping blocks and using the links between pages of distinct blocks to create hyperarcs. A hyperarc connects a block of pages to a single page, in order to provide more reliable information for link analysis. We use two partition criteria to form the blocks: grouping together the pages that belong to the same web host, or to the same web domain. On top of the hypergraph model we define hypergraph versions of the Pagerank and Indegree algorithms, referred to as HyperPagerank and HyperIndegree, respectively. We compared the original page-based algorithms with their host-based and domain-based versions, combining page reputation with the textual content of the pages and the anchor text. Experimental results on three distinct web collections show that the HyperPagerank and HyperIndegree algorithms may yield better results than the original graph versions of the Pagerank and Indegree algorithms. We also show that the hypergraph versions of the algorithms were slightly less affected by noise links and spamming.
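As a rough illustration of the host-based partition, the sketch below builds hyperarcs from a list of page-to-page links and counts them per target page. It is a minimal sketch under stated assumptions, not the definition used in this work: the function names (`host_of`, `build_hyperarcs`, `hyper_indegree`), the example URLs, and the choice to score a page by the number of distinct external blocks linking to it are all hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def host_of(url):
    """Block criterion (assumed): the lowercase host part of the URL."""
    return urlsplit(url).netloc.lower()


def build_hyperarcs(links, block_of=host_of):
    """Collapse page-to-page links into (source_block, target_page) hyperarcs.

    Links whose source and target fall in the same block are dropped,
    since only links between pages of distinct blocks create hyperarcs.
    """
    hyperarcs = set()
    for src, dst in links:
        src_block = block_of(src)
        if src_block != block_of(dst):
            hyperarcs.add((src_block, dst))
    return hyperarcs


def hyper_indegree(links, block_of=host_of):
    """Assumed scoring: number of distinct external blocks linking to a page."""
    scores = defaultdict(int)
    for _block, page in build_hyperarcs(links, block_of):
        scores[page] += 1
    return dict(scores)


def indegree(links):
    """Plain page-level indegree, for comparison."""
    scores = defaultdict(int)
    for _src, dst in set(links):
        scores[dst] += 1
    return dict(scores)


# Hypothetical toy web: three pages of one host all link to the same target.
links = [
    ("http://a.example/1", "http://b.example/x"),
    ("http://a.example/2", "http://b.example/x"),
    ("http://a.example/3", "http://b.example/x"),
    ("http://c.example/1", "http://b.example/x"),
    ("http://b.example/y", "http://b.example/x"),  # same host, ignored
]

print(indegree(links))
print(hyper_indegree(links))
```

In this toy input the three links from the same host collapse into a single hyperarc, which suggests why a block-level count could be less sensitive to many links issued by one source. A domain-based variant would only need a different `block_of` function.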