Search engines currently face the problem of websites that distribute malware. In this paper we present a novel, efficient algorithm that learns to detect this kind of spam. We use a bipartite graph with two types of nodes, each forming a layer of the graph: websites and file hostings (FH). An edge connects a website to a file hosting when a file can be downloaded from the hosting via a link on the website. The performance of this spam detection method was verified using two sets of ground-truth labels: manual assessments by antivirus analysts and automatically generated assessments obtained from antivirus companies. We demonstrate that the proposed method is able to detect new types of malware even before the best-known antivirus solutions are able to detect them.
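
To make the graph structure concrete, the following minimal sketch builds a two-layer website / file-hosting graph from link pairs. It is only an illustration: the function names, the example domains, and the single averaging pass in `propagate_once` are assumptions introduced here, not the learning algorithm proposed in the paper.

```python
from collections import defaultdict


def build_bipartite(links):
    """Build both adjacency maps from (website, file_hosting) link pairs."""
    site_to_fh = defaultdict(set)
    fh_to_site = defaultdict(set)
    for site, fh in links:
        site_to_fh[site].add(fh)
        fh_to_site[fh].add(site)
    return site_to_fh, fh_to_site


def propagate_once(site_to_fh, fh_to_site, labeled_bad):
    """Hypothetical one-pass scoring, NOT the paper's method.

    A hosting's score is the fraction of its linking websites that are
    labeled malicious; a website's score is the mean score of the
    hostings it links to.
    """
    fh_score = {
        fh: sum(site in labeled_bad for site in sites) / len(sites)
        for fh, sites in fh_to_site.items()
    }
    site_score = {
        site: sum(fh_score[fh] for fh in fhs) / len(fhs)
        for site, fhs in site_to_fh.items()
    }
    return fh_score, site_score


# Made-up example data: two websites share one hosting, a third uses another.
links = [
    ("a.example", "fh1.example"),
    ("b.example", "fh1.example"),
    ("c.example", "fh2.example"),
]
site_to_fh, fh_to_site = build_bipartite(links)
fh_score, site_score = propagate_once(site_to_fh, fh_to_site, {"a.example"})
```

In this toy data, `b.example` has no label of its own but receives a nonzero score because it shares a hosting with a labeled website. That is the kind of signal a two-layer graph makes available.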