Identifying automatic posting systems in microblogs

Authors:
Gustavo Laboreiro;Luís Sarmento;Eugénio Oliveira
Affiliations:
Faculdade de Engenharia da Universidade do Porto, DEI, LIACC, Portugal;Faculdade de Engenharia da Universidade do Porto, DEI, LIACC and SAPO Labs Porto, Portugal;Faculdade de Engenharia da Universidade do Porto, DEI, LIACC, Portugal
Venue:
EPIA'11 Proceedings of the 15th Portuguese conference on Progress in artificial intelligence
Year:
2011

Abstract

In this paper we study the problem of identifying systems that automatically inject non-personal messages into micro-blogging message streams, potentially biasing the results of information extraction procedures such as opinion mining and trend analysis. We study several classes of features for this task: features based on the time of posting, the client used to post, the presence of links, user interaction, and writing style. This last class, which we introduce here for the first time, proves to be a top performer, achieving accuracy close to 90%, on par with the best features previously used for this task.
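
To make the five feature classes named in the abstract more concrete, the following is a minimal Python sketch that derives one or two toy features per class from a single message. It is an illustration only and not the authors' feature set: the `Message` record, the `extract_features` function, the regular expressions, and the specific features chosen (second of the minute, upper-case ratio, exclamation count) are all assumptions introduced here.

```python
import re
from dataclasses import dataclass
from datetime import datetime

# Hypothetical patterns; real microblog tokenization is more involved.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")


@dataclass
class Message:
    """Hypothetical record for one microblog post."""
    text: str
    posted_at: datetime
    client: str  # name of the application used to post


def extract_features(msg: Message) -> dict:
    """Return one illustrative feature (or two) per feature class."""
    text = msg.text
    letters = [c for c in text if c.isalpha()]
    upper_ratio = (
        sum(c.isupper() for c in letters) / len(letters) if letters else 0.0
    )
    return {
        # Time of posting: scheduled posts may cluster on fixed seconds.
        "second_of_minute": msg.posted_at.second,
        # Client used to post.
        "client": msg.client.lower(),
        # Presence of links.
        "has_link": bool(URL_RE.search(text)),
        # User interaction: mentions of other accounts.
        "mention_count": len(MENTION_RE.findall(text)),
        # Writing style: crude surface cues.
        "upper_ratio": upper_ratio,
        "exclamation_count": text.count("!"),
    }


if __name__ == "__main__":
    example = Message(
        text="@ana lol SO true!!",
        posted_at=datetime(2011, 10, 10, 18, 42, 37),
        client="Web",
    )
    print(extract_features(example))
```

Feature dictionaries of this kind would then be fed to a classifier. Which features and which learning setup actually reach the accuracy reported above is described in the paper itself, not in this sketch.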