Blogging has emerged as a medium for people to express themselves. However, the presence of spam blogs (also known as splogs) may reduce the value of blogs and blog search engines. Hence, splog detection has recently attracted much attention from researchers. Most existing work on splog detection identifies splogs using their content and link features, and aims at spam filters that protect a blog search engine's index from spam. In this paper, we propose a splog detection framework that monitors online search results. The novelty of our approach is that it capitalizes on the results returned by search engines. The proposed method is therefore particularly useful for detecting splogs that have slipped through the spam filters and are actively generating spam posts. More specifically, our method monitors the top-ranked results of a sequence of temporally ordered queries and detects splogs based on blogs' temporal behavior. The temporal behavior of a blog is maintained in a blog profile. Given the blog profiles, we propose splog detection functions and evaluate them using real data collected from a popular blog search engine. Our experiments demonstrate that splogs can be detected with high accuracy. The proposed method can be implemented on top of any existing blog search engine without intruding on the engine's internals.
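The monitoring idea can be illustrated with a minimal sketch. The abstract does not specify the profile contents or the detection functions, so everything below is an assumption made for illustration: the `BlogProfile` fields, the `update_profiles` and `splog_score` helpers, the `top_k` cutoff, and the threshold are all hypothetical. The score used here is simply the fraction of monitored query rounds in which a blog appeared among the top-ranked results, a stand-in for "temporal behavior" rather than the paper's actual function.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class BlogProfile:
    """Hypothetical per-blog record built from monitored search results."""
    hits: int = 0                              # total top-k appearances
    rounds: set = field(default_factory=set)   # query rounds with an appearance
    posts: set = field(default_factory=set)    # distinct post URLs observed


def update_profiles(profiles, round_id, results, top_k=10):
    """Record the top_k results of one query round.

    `results` is a ranked list of (blog_url, post_url) pairs, assumed to be
    what a blog search engine returns for the query issued in this round.
    """
    for blog_url, post_url in results[:top_k]:
        profile = profiles[blog_url]
        profile.hits += 1
        profile.rounds.add(round_id)
        profile.posts.add(post_url)


def splog_score(profile, total_rounds):
    """Illustrative score: share of rounds in which the blog surfaced."""
    if total_rounds == 0:
        return 0.0
    return len(profile.rounds) / total_rounds


def flag_splogs(profiles, total_rounds, threshold=0.75):
    """Return blogs whose score meets the (assumed) threshold, sorted by URL."""
    return sorted(
        url for url, profile in profiles.items()
        if splog_score(profile, total_rounds) >= threshold
    )


# Toy data: four temporally ordered query rounds with made-up URLs.
rounds = [
    [("spam.example", "spam.example/p1"), ("alice.example", "alice.example/a1")],
    [("spam.example", "spam.example/p2"), ("bob.example", "bob.example/b1")],
    [("spam.example", "spam.example/p3"), ("alice.example", "alice.example/a2")],
    [("spam.example", "spam.example/p4")],
]

profiles = defaultdict(BlogProfile)
for round_id, results in enumerate(rounds):
    update_profiles(profiles, round_id, results)

suspects = flag_splogs(profiles, total_rounds=len(rounds))
print(suspects)
```

Because the sketch only consumes ranked result lists, it runs entirely outside the search engine, which mirrors the non-intrusive deployment described above. A real detector would need a score that separates persistent spam from legitimately prolific blogs.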