Loose tweets: an analysis of privacy leaks on Twitter

Authors:
Huina Mao; Xin Shuai; Apu Kapadia
Affiliations:
Indiana University Bloomington, Bloomington, IN, USA; Indiana University Bloomington, Bloomington, IN, USA; Indiana University Bloomington, Bloomington, IN, USA
Venue:
Proceedings of the 10th annual ACM workshop on Privacy in the electronic society
Year:
2011

Abstract

Twitter has become one of the most popular microblogging sites for people to broadcast (or "tweet") their thoughts to the world in 140 characters or fewer. Since these messages are available for public consumption, one might expect tweets not to contain private or incriminating information. Nevertheless, we observe a large number of users who unwittingly post sensitive information about themselves and about other people, for whom there may be negative consequences. While some awareness exists of such privacy issues on social networks such as Twitter and Facebook, there has been no quantitative, scientific study addressing this problem. In this paper we make three major contributions. First, we characterize the nature of privacy leaks on Twitter to understand what types of private information people reveal there. We specifically analyze three types of leaks: divulging vacation plans, tweeting under the influence of alcohol, and revealing medical conditions. Second, using this characterization, we build automatic classifiers that detect incriminating tweets on these three topics in real time, in order to demonstrate the real threat posed to users by, e.g., burglars and law enforcement. Third, we characterize who leaks information and how. We study both self-incriminating primary leaks and secondary leaks that reveal sensitive information about others, as well as the prevalence of leaks in status updates and conversation tweets. We also conduct a cross-cultural study to investigate the prevalence of leaks in tweets originating from the United States, the United Kingdom, and Singapore. Finally, we discuss how our classification system can be used as a defense mechanism to alert users to potential privacy leaks.
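
To make the idea of topic classifiers for the three leak types concrete, the following is a minimal sketch of a bag-of-words text classifier. It is not the authors' implementation: the abstract does not say which model or features were used, so a multinomial Naive Bayes with add-one smoothing is assumed here purely for illustration. The class name `NaiveBayesTextClassifier`, the `tokenize` helper, the label names, and the `TOY_TWEETS` examples are all hypothetical and invented for this sketch; they are not data from the study.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing.

    Assumed model for illustration only; the abstract does not
    specify the classifier used in the paper.
    """

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)       # documents per class
        self.word_counts = defaultdict(Counter)   # token counts per class
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        # Tokens never seen in training are ignored.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for t in tokens:
                score += math.log((counts[t] + 1) / denom)  # smoothed likelihood
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy examples (NOT from the paper's dataset), one label per topic
# named in the abstract plus a hypothetical "none" class.
TOY_TWEETS = [
    ("leaving for vacation tomorrow, gone for two weeks", "vacation"),
    ("flight to hawaii tonight, house empty all week", "vacation"),
    ("so drunk right now, too many beers", "drunk"),
    ("hungover after way too much vodka last night", "drunk"),
    ("just diagnosed with diabetes, starting insulin", "medical"),
    ("my doctor says the depression meds are working", "medical"),
    ("great coffee this morning, nice weather", "none"),
    ("watching the game with friends tonight", "none"),
]

texts, labels = zip(*TOY_TWEETS)
classifier = NaiveBayesTextClassifier().fit(texts, labels)
print(classifier.predict("out of town on vacation for a week"))
```

A real system along these lines would need a much larger labeled corpus and a held-out evaluation; the toy data above only shows the mechanics of training and prediction.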