The anatomy of a large-scale hypertextual Web search engine
WWW7 Proceedings of the seventh international conference on World Wide Web 7
Authoritative sources in a hyperlinked environment
Journal of the ACM (JACM)
Analysis of a very large web search engine query log
ACM SIGIR Forum
Text Classification from Labeled and Unlabeled Documents using EM
Machine Learning - Special issue on information retrieval
Effective site finding using link anchor information
Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
The Philosophy of Information Retrieval Evaluation
CLEF '01 Revised Papers from the Second Workshop of the Cross-Language Evaluation Forum on Evaluation of Cross-Language Information Retrieval Systems
PAC Learning from Positive Statistical Queries
ALT '98 Proceedings of the 9th International Conference on Algorithmic Learning Theory
The connectivity sonar: detecting site functionality by structural patterns
Proceedings of the fourteenth ACM conference on Hypertext and hypermedia
One-class svms for document classification
The Journal of Machine Learning Research
PEBL: Web Page Classification without Negative Examples
IEEE Transactions on Knowledge and Data Engineering
Learning block importance models for web pages
Proceedings of the 13th international conference on World Wide Web
Discovery of ads web hosts through traffic data analysis
Proceedings of the 9th ACM SIGMOD workshop on Research issues in data mining and knowledge discovery
Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval
Spam, damn spam, and statistics: using statistical analysis to locate spam web pages
Proceedings of the 7th International Workshop on the Web and Databases: colocated with ACM SIGMOD/PODS 2004
Detecting spam web pages through content analysis
Proceedings of the 15th international conference on World Wide Web
Improving web search ranking by incorporating user behavior information
SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Spam double-funnel: connecting web spammers with advertisers
Proceedings of the 16th international conference on World Wide Web
Improving web spam classification using rank-time features
AIRWeb '07 Proceedings of the 3rd international workshop on Adversarial information retrieval on the web
Boosting the Performance of Web Spam Detection with Ensemble Under-Sampling Classification
FSKD '07 Proceedings of the Fourth International Conference on Fuzzy Systems and Knowledge Discovery - Volume 04
Combating web spam with trustrank
VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
Mining the search trails of surfing crowds: identifying relevant websites from user activity
Proceedings of the 17th international conference on World Wide Web
Using the wisdom of the crowds for keyword generation
Proceedings of the 17th international conference on World Wide Web
User behavior oriented web spam detection
Proceedings of the 17th international conference on World Wide Web
BrowseRank: letting web users vote for page importance
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Computer
A large-scale study of automated web search traffic
AIRWeb '08 Proceedings of the 4th international workshop on Adversarial information retrieval on the web
Identifying web spam with user behavior analysis
AIRWeb '08 Proceedings of the 4th international workshop on Adversarial information retrieval on the web
Query-log mining for detecting spam
AIRWeb '08 Proceedings of the 4th international workshop on Adversarial information retrieval on the web
Exploring linguistic features for web spam detection: a preliminary study
AIRWeb '08 Proceedings of the 4th international workshop on Adversarial information retrieval on the web
Challenges in web search engines
IJCAI'03 Proceedings of the 18th international joint conference on Artificial intelligence
Foundations and Trends in Information Retrieval
Efficient and effective spam filtering and re-ranking for large web datasets
Information Retrieval
Constructing a reliable Web graph with information on browsing behavior
Decision Support Systems
Semantic security against web application attacks
Information Sciences: an International Journal
Hi-index | 0.00 |
Combating Web spam has become one of the top challenges for Web search engines. State-of-the-art spam-detection techniques are usually designed for specific, known types of Web spam and are incapable of dealing with newly appearing spam types efficiently. With user-behavior analyses from Web access logs, a spam page-detection algorithm is proposed based on a learning scheme. The main contributions are the following. (1) User-visiting patterns of spam pages are studied, and a number of user-behavior features are proposed for separating Web spam pages from ordinary pages. (2) A novel spam-detection framework is proposed that can detect various kinds of Web spam, including newly appearing ones, with the help of the user-behavior analysis. Experiments on large-scale practical Web access log data show the effectiveness of the proposed features and the detection framework.