Detecting Anomalies in Bipartite Graphs with Mutual Dependency Principles

Authors:
Hanbo Dai;Feida Zhu;Ee-Peng Lim;HweeHwa Pang
Venue:
ICDM '12 Proceedings of the 2012 IEEE 12th International Conference on Data Mining
Year:
2012

Citing 0
Cited 2

A probabilistic graphical model for brand reputation assessment in social networks
Proceedings of the 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining

Learning an accurate entity resolution model from crowdsourced labels
Proceedings of the 8th International Conference on Ubiquitous Information Management and Communication

Abstract

Bipartite graphs can model many real-life applications, including users-rating-products in online marketplaces, users-clicking-webpages on the World Wide Web, and users-referring-users in social networks. In these graphs, the anomalousness of nodes in one partite often depends on that of their connected nodes in the other partite. Previous studies have shown that this dependency can be positive (the anomalousness of a node in one partite increases or decreases along with that of its connected nodes in the other partite) or negative (the anomalousness of a node in one partite rises or falls in the opposite direction to that of its connected nodes in the other partite). In this paper, we unify both positive and negative mutual dependency relationships in an unsupervised framework for detecting anomalous nodes in bipartite graphs. This is the first work that integrates both mutual dependency principles to model the complete set of anomalous behaviors of nodes, which cannot be identified by either principle alone. We formulate our principles and design an iterative algorithm to simultaneously compute the anomaly scores of nodes in both partites. Moreover, we mathematically prove that the ranking of nodes by anomaly scores in each partite converges. Our framework is examined on synthetic graphs, and the results show that our model outperforms existing models that use only the positive or only the negative mutual dependency principle. We also apply our framework to two real-life datasets: Goodreads as a users-rating-books setting and Buzzcity as a users-clicking-advertisements setting. The results show that our method is able to detect suspected spamming users and spammed books in Goodreads, and that it achieves higher precision in identifying fraudulent advertisement publishers than existing approaches.
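
The abstract does not give the paper's update equations, so the following is only a minimal sketch of the positive mutual dependency idea on its own, using a HITS-style mutual reinforcement iteration as a stand-in. It is not the authors' unified model and does not cover negative dependency. The function name `positive_dependency_scores`, the edge-list representation, and the toy graph are all assumptions made for illustration.

```python
import math


def positive_dependency_scores(edges, n_users, n_items, iterations=50):
    """HITS-style mutual reinforcement on a bipartite graph (illustrative only).

    edges: list of (user_index, item_index) pairs.
    Each node's score is the sum of the scores of its neighbours in the
    other partite, followed by L2 normalisation, so scores on the two
    sides rise and fall together (a positive mutual dependency).
    """
    user = [1.0] * n_users
    item = [1.0] * n_items
    for _ in range(iterations):
        # Update item scores from the current user scores.
        new_item = [0.0] * n_items
        for u, i in edges:
            new_item[i] += user[u]
        norm = math.sqrt(sum(s * s for s in new_item)) or 1.0
        item = [s / norm for s in new_item]

        # Update user scores from the freshly updated item scores.
        new_user = [0.0] * n_users
        for u, i in edges:
            new_user[u] += item[i]
        norm = math.sqrt(sum(s * s for s in new_user)) or 1.0
        user = [s / norm for s in new_user]
    return user, item


# Hypothetical toy graph: users 0 and 1 both touch items 0 and 1,
# while user 2 touches only item 2.
toy_edges = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 2)]
user_scores, item_scores = positive_dependency_scores(toy_edges, 3, 3)

# Rank users from highest to lowest score.
user_ranking = sorted(range(3), key=lambda u: -user_scores[u])
```

In this toy graph the two users that share the same items reinforce each other and end up with equal, high scores, while the isolated pair's scores decay toward zero. Whether a high score should be read as anomalous depends on the application, and a negative-dependency rule would need a different update that this sketch does not attempt.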