TwitterEcho: a distributed focused crawler to support open research with twitter data

Authors:
Matko Bošnjak;Eduardo Oliveira;José Martins;Eduarda Mendes Rodrigues;Luís Sarmento
Affiliations:
University of Porto, Porto, Portugal;University of Porto, Porto, Portugal;University of Porto, Porto, Portugal;University of Porto, Porto, Portugal;Sapo.pt - Portugal Telecom, Lisbon, Portugal
Venue:
Proceedings of the 21st international conference companion on World Wide Web
Year:
2012


Abstract

Modern social network analysis relies on vast quantities of data to infer new knowledge about human relations and communication. In this paper we describe TwitterEcho, an open-source Twitter crawler with a modular, distributed architecture, built to support this kind of research. The crawler lets researchers continuously collect data from particular user communities while respecting the limits Twitter imposes. We present the core modules of the crawling server, some of which were designed specifically to focus the crawl on the Portuguese Twittosphere. Additional modules can easily be implemented to shift the focus to a different community. Our evaluation of the system shows high crawling performance and coverage.