Fake medical Web sites have become increasingly prevalent. Consequently, much of the health-related information and advice available online is inaccurate or misleading. Scores of medical institution Web sites represent organizations that do not exist, and more than 90% of online pharmacy Web sites are fraudulent. In addition to the monetary losses inflicted on unsuspecting users, these fake medical Web sites have severe public safety ramifications. According to a World Health Organization report, approximately half the drugs sold on the Web are counterfeit, resulting in thousands of deaths. In this study, we propose an adaptive learning algorithm called recursive trust labeling (RTL). RTL uses underlying content-based and graph-based classifiers, coupled with a recursive labeling mechanism, for enhanced detection of fake medical Web sites. The proposed method was evaluated on a test bed encompassing nearly 100 million links between 930,000 Web sites, including 1,000 known legitimate and fake medical sites. The experimental results revealed that RTL significantly improved fake medical Web site detection performance over 19 comparison content-based and graph-based methods, various meta-learning techniques, and existing adaptive learning approaches, with an overall accuracy of over 94%. Moreover, RTL attained high performance levels even when the training dataset comprised as few as 30 Web sites. With the increased popularity of eHealth and Health 2.0, the results have important implications for online trust, security, and public safety.
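The abstract does not specify how RTL combines its classifiers or how the recursive labeling step works, so the following Python sketch is only a generic illustration of the idea of iteratively labeling sites from a small seed set using both content and link evidence. It is not the authors' algorithm. Every name here (`content_score`, `graph_score`, `recursive_labeling`, the `fake_cue_ratio` feature, the equal weighting, and the `threshold` value) is a hypothetical assumption introduced for illustration.

```python
def content_score(features):
    # Hypothetical stand-in for a content classifier:
    # returns an estimated probability in [0, 1] that the site is fake.
    return features["fake_cue_ratio"]


def graph_score(site, links, labels):
    # Hypothetical stand-in for a graph classifier: the fraction of
    # already-labeled linked sites that are labeled fake.
    neighbors = [n for n in links.get(site, ()) if n in labels]
    if not neighbors:
        return 0.5  # no labeled neighbors, so no graph evidence either way
    return sum(labels[n] == "fake" for n in neighbors) / len(neighbors)


def recursive_labeling(features, links, seed_labels, threshold=0.25, max_rounds=10):
    # Start from the small seed set and repeatedly label the unlabeled
    # sites whose combined score is confidently far from 0.5.
    labels = dict(seed_labels)
    for _ in range(max_rounds):
        newly_labeled = {}
        for site in features:
            if site in labels:
                continue
            # Equal weighting of the two scores is an assumption.
            score = 0.5 * content_score(features[site]) \
                + 0.5 * graph_score(site, links, labels)
            if abs(score - 0.5) >= threshold:
                newly_labeled[site] = "fake" if score > 0.5 else "legit"
        if not newly_labeled:
            break  # nothing confident left to label
        labels.update(newly_labeled)
    return labels


# Toy data, invented for this example only.
features = {
    "a": {"fake_cue_ratio": 0.9},
    "b": {"fake_cue_ratio": 0.1},
    "c": {"fake_cue_ratio": 0.9},
    "d": {"fake_cue_ratio": 0.1},
    "e": {"fake_cue_ratio": 0.6},
}
links = {"c": ["a"], "d": ["b"], "e": ["c"]}
seeds = {"a": "fake", "b": "legit"}

result = recursive_labeling(features, links, seeds)
```

In this toy run, site `e` cannot be labeled in the first round because its only neighbor `c` is still unlabeled. Once `c` is labeled, the graph evidence pushes `e` past the threshold in the second round, which is the kind of effect a recursive labeling mechanism is meant to exploit when the seed set is small.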