We describe the WEBSPAM-UK2006 collection, a large set of Web pages that have been manually annotated with labels indicating whether the hosts include Web spam aspects. This is the first publicly available Web spam collection that includes page contents and links, and that has been labelled by a large and diverse set of judges.