Reliability prediction of webpages in the medical domain

Authors:
Parikshit Sondhi;V. G. Vinod Vydiswaran;Cheng Xiang Zhai
Affiliations:
Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL;Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL;Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL
Venue:
ECIR'12 Proceedings of the 34th European conference on Advances in Information Retrieval
Year:
2012

Abstract

In this paper, we study how to automatically predict the reliability of webpages in the medical domain. Assessing the reliability of online medical information is especially critical because it may influence vulnerable patients seeking help online. Unfortunately, no automated system is currently available that can classify a medical webpage as reliable, and manual assessment cannot scale to the large number of medical pages on the Web. We propose a supervised learning approach to automatically predict the reliability of medical webpages. We developed a gold standard dataset using the standard reliability criteria defined by the Health On the Net Foundation and systematically experimented with different link-based and content-based feature sets. Our experiments show promising results, with prediction accuracies of over 80%. We also show that the proposed prediction method is useful in applications such as reliability-based re-ranking and automatic website accreditation.