Does “authority” mean quality? Predicting expert quality ratings of Web documents

  • Authors:
  • Brian Amento;Loren Terveen;Will Hill

  • Affiliations:
  • AT&T Shannon Laboratories, 180 Park Avenue, Florham Park, NJ and Department of Computer Science, Virginia Tech;AT&T Shannon Laboratories, 180 Park Avenue, Florham Park, NJ;AT&T Shannon Laboratories, 180 Park Avenue, Florham Park, NJ

  • Venue:
  • SIGIR '00 Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval
  • Year:
  • 2000


Abstract

For many topics, the World Wide Web contains hundreds or thousands of relevant documents of widely varying quality. Users face a daunting challenge in identifying a small subset of documents worthy of their attention. Link analysis algorithms have received much interest recently, in large part for their potential to identify high-quality items. We report here on an experimental evaluation of this potential. We evaluated a number of link- and content-based algorithms using a dataset of Web documents rated for quality by human topic experts. Link-based metrics did a good job of picking out high-quality items: precision at 5 is about 0.75 and precision at 10 is about 0.55, in a dataset where 32% of all documents were of high quality. Surprisingly, a simple content-based metric performed nearly as well: ranking documents by the total number of pages on their containing site.
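
The sketch below illustrates the two quantities the abstract reports: precision at k over a ranked list with binary expert quality judgments, and a ranking produced by the site-size heuristic (ordering documents by the number of pages on their containing site). It is a minimal illustration with hypothetical data structures and names, not code from the paper.

```python
def precision_at_k(ranked_docs, is_high_quality, k):
    """Fraction of the top-k ranked documents judged high quality.

    ranked_docs: list of document ids, best-first.
    is_high_quality: dict mapping document id -> bool (expert judgment).
    """
    top_k = ranked_docs[:k]
    hits = sum(1 for doc in top_k if is_high_quality.get(doc, False))
    return hits / k


def rank_by_site_size(docs, pages_on_site):
    """Order documents by the total number of pages on their containing site,
    largest site first (the simple content-based metric named in the abstract).

    docs: iterable of document ids.
    pages_on_site: dict mapping document id -> page count of its site.
    """
    return sorted(docs, key=lambda d: pages_on_site.get(d, 0), reverse=True)


if __name__ == "__main__":
    # Toy data; values are illustrative only.
    docs = ["a", "b", "c", "d", "e"]
    pages = {"a": 120, "b": 15, "c": 300, "d": 42, "e": 7}
    quality = {"a": True, "b": False, "c": True, "d": False, "e": True}

    ranking = rank_by_site_size(docs, pages)      # ['c', 'a', 'd', 'b', 'e']
    print(precision_at_k(ranking, quality, 5))    # 0.6 for this toy data
```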