SpeedTracer: a Web usage mining and analysis tool
IBM Systems Journal
Optimizing search engines using clickthrough data
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
Using terminological feedback for web search refinement: a log-based study
Proceedings of the 26th annual international ACM SIGIR conference on Research and development in informaion retrieval
Spam, damn spam, and statistics: using statistical analysis to locate spam web pages
Proceedings of the 7th International Workshop on the Web and Databases: colocated with ACM SIGMOD/PODS 2004
Accurately interpreting clickthrough data as implicit feedback
Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
A large scale study of wireless search behavior: Google mobile search
Proceedings of the SIGCHI Conference on Human Factors in Computing Systems
Learning user interaction models for predicting web search result preferences
SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Wide-scale botnet detection and characterization
HotBots'07 Proceedings of the first conference on First Workshop on Hot Topics in Understanding Botnets
HotBots'07 Proceedings of the first conference on First Workshop on Hot Topics in Understanding Botnets
Is a bot at the controls?: Detecting input data attacks
Proceedings of the 6th ACM SIGCOMM workshop on Network and system support for games
Nullification test collections for web spam and SEO
Proceedings of the 5th International Workshop on Adversarial Information Retrieval on the Web
SBotMiner: large scale search bot detection
Proceedings of the third ACM international conference on Web search and data mining
Large-scale bot detection for search engines
Proceedings of the 19th international conference on World wide web
Using related queries to improve web search results ranking
SPIRE'10 Proceedings of the 17th international conference on String processing and information retrieval
Searching the searchers with searchaudit
USENIX Security'10 Proceedings of the 19th USENIX conference on Security
Foundations and Trends in Information Retrieval
What's clicking what? techniques and innovations of today's clickbots
DIMVA'11 Proceedings of the 8th international conference on Detection of intrusions and malware, and vulnerability assessment
Identifying Web Spam with the Wisdom of the Crowds
ACM Transactions on the Web (TWEB)
Hi-index | 0.00 |
As web search providers seek to improve both relevance and response times, they are challenged by the ever-increasing tax of automated search query traffic. Third party systems interact with search engines for a variety of reasons, such as monitoring a website's rank, augmenting online games, or possibly to maliciously alter click-through rates. In this paper, we investigate automated traffic in the query stream of a large search engine provider. We define automated traffic as any search query not generated by a human in real time. We first provide examples of different categories of query logs generated by bots. We then develop many different features that distinguish between queries generated by people searching for information, and those generated by automated processes. We categorize these features into two classes, either an interpretation of the physical model of human interactions, or as behavioral patterns of automated interactions. We believe these features formulate a basis for a production-level query stream classifier.