Credibility ranking of tweets during high impact events

  • Authors:
  • Aditi Gupta; Ponnurangam Kumaraguru

  • Affiliations:
  • Indraprastha Institute of Information Technology, Delhi, India; Indraprastha Institute of Information Technology, Delhi, India

  • Venue:
  • Proceedings of the 1st Workshop on Privacy and Security in Online Social Media
  • Year:
  • 2012

Abstract

Twitter has evolved from a medium for conversation and opinion sharing among friends into a platform for sharing and disseminating information about current events. Events in the real world trigger a corresponding spurt of posts (tweets) on Twitter. Not all content posted on Twitter is trustworthy or useful in providing information about the event. In this paper, we analyzed the credibility of information in tweets corresponding to fourteen high impact news events of 2011 around the globe. In the data we analyzed, on average 30% of the tweets posted about an event contained situational information about the event, while 14% were spam. Only 17% of the tweets posted about an event contained situational awareness information that was credible. Using regression analysis, we identified the important content-based and source-based features that can predict the credibility of information in a tweet. Prominent content-based features were the number of unique characters, swear words, pronouns, and emoticons in a tweet; prominent user-based features included the number of followers and the length of the username. We adopted a supervised machine learning and relevance feedback approach using the above features to rank tweets according to their credibility score. The performance of our ranking algorithm improved significantly when we applied a re-ranking strategy. Results show that extraction of credible information from Twitter can be automated with high confidence.
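
To make the feature set named in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It extracts the six named features from a tweet and ranks tweets by a simple weighted sum. The `Tweet` class, the word lists, the emoticon pattern, and every weight (including its sign) are hypothetical placeholders chosen for illustration; the paper learns its scoring function with supervised machine learning and adds relevance feedback re-ranking, neither of which is reproduced here.

```python
import re
from dataclasses import dataclass

# Hypothetical lexicons: the lists used in the paper are not given in the abstract.
PRONOUNS = {"i", "me", "my", "you", "your", "we", "our", "he", "she", "they"}
SWEAR_WORDS = {"damn", "hell"}
# Simple pattern for emoticons such as :) ;-( =D (an assumption, not exhaustive).
EMOTICON_RE = re.compile(r"[:;=][-']?[)(DPp]")


@dataclass
class Tweet:
    text: str
    username: str
    followers: int


def extract_features(tweet):
    """Compute the content-based and user-based features named in the abstract."""
    words = re.findall(r"[a-z']+", tweet.text.lower())
    return {
        "unique_chars": len(set(tweet.text)),
        "swear_words": sum(w in SWEAR_WORDS for w in words),
        "pronouns": sum(w in PRONOUNS for w in words),
        "emoticons": len(EMOTICON_RE.findall(tweet.text)),
        "followers": tweet.followers,
        "username_length": len(tweet.username),
    }


# Illustrative weights only: neither the values nor the signs come from the paper.
WEIGHTS = {
    "unique_chars": 0.05,
    "swear_words": -1.0,
    "pronouns": -0.3,
    "emoticons": -0.5,
    "followers": 0.001,
    "username_length": -0.02,
}


def credibility_score(tweet):
    """Weighted sum of features, standing in for a learned ranking model."""
    features = extract_features(tweet)
    return sum(WEIGHTS[name] * value for name, value in features.items())


def rank_tweets(tweets):
    """Return tweets ordered from highest to lowest credibility score."""
    return sorted(tweets, key=credibility_score, reverse=True)
```

In practice the weights would be fit to human-annotated credibility labels rather than set by hand, and the top-ranked results could then be used to re-rank the remaining tweets.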