Web crawlers are complex applications that explore the Web for different purposes. They can be configured to crawl online social networks (OSNs) to obtain relevant data about their global structure. Before a web crawler can be launched, a large number of settings have to be configured. These settings define the behavior of the crawler and have a big impact on the collected data. Both the amount of collected data and the quality of the information it contains are affected by the crawler settings; therefore, by properly configuring these settings we can target specific goals for a crawl. In this paper, we analyze how different scheduler algorithms affect the collected data in terms of users' privacy. Furthermore, we introduce the concept of an online social honeynet (OShN) to protect OSNs from web crawlers, and we provide an OShN proof of concept that achieves good results in protecting an OSN from a specific web crawler.
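To make the role of the scheduler concrete, the following minimal sketch crawls a toy friendship graph under a fixed budget with two different schedulers. It is an illustration under stated assumptions, not the crawler or the algorithms evaluated in the paper: the breadth-first and depth-first policies, the `crawl` function, and the `TOY_GRAPH` data are all hypothetical choices made for this example.

```python
from collections import deque

# Hypothetical toy OSN: each user maps to the friends listed on their profile.
TOY_GRAPH = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "e"],
    "d": ["b"],
    "e": ["c", "f"],
    "f": ["e"],
}


def crawl(graph, seed, budget, scheduler="bfs"):
    """Crawl `graph` from `seed`, visiting at most `budget` users.

    `scheduler` is "bfs" (oldest queued user first) or "dfs" (newest
    queued user first). Returns the crawl order and the friendship
    links that the crawl disclosed.
    """
    frontier = deque([seed])
    queued = {seed}
    crawled = []
    edges = set()
    while frontier and len(crawled) < budget:
        # The scheduler decides which queued user is crawled next.
        user = frontier.popleft() if scheduler == "bfs" else frontier.pop()
        crawled.append(user)
        for friend in graph.get(user, ()):
            # Every crawled profile reveals all of that user's links.
            edges.add(frozenset((user, friend)))
            if friend not in queued:
                queued.add(friend)
                frontier.append(friend)
    return crawled, edges


if __name__ == "__main__":
    for policy in ("bfs", "dfs"):
        order, links = crawl(TOY_GRAPH, "a", budget=3, scheduler=policy)
        print(policy, order, sorted(tuple(sorted(link)) for link in links))
```

With the same seed and the same budget, the two policies visit different users and therefore disclose different friendship links, which is the kind of scheduler-dependent effect on collected data that the abstract refers to.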