Microblogging services such as Twitter continue to surge in importance as a means of sharing information in social networks. In the medical domain, several works have shown the potential of detecting public health events (i.e., infectious disease outbreaks) using Twitter messages, or tweets. Given its real-time nature, Twitter can enhance early outbreak warning for public health authorities so that a rapid response can take place. Most previous work on detecting outbreaks in Twitter simply analyzes tweets that match disease names and/or locations of interest. However, the effectiveness of such a method is limited for two main reasons. First, disease names are highly ambiguous: they may be used as slang or in non-health-related contexts. Second, the characteristics of infectious diseases are highly dynamic in time and place: they are strongly time-dependent and vary greatly among regions. In this paper, we propose to analyze the temporal diversity of tweets during the known periods of real-world outbreaks in order to gain insight into a temporary focus on specific events. More precisely, our objective is to understand whether, and to what extent, the temporal diversity of tweets can be used as an indicator of outbreak events. We employ an efficient sampling-based algorithm to compute the diversity statistics of tweets at a particular time. To this end, we conduct experiments that correlate temporal diversity with the estimated event magnitude of 14 real-world outbreak events, which were manually created as ground truth. Our analysis shows that the correlation results vary across outbreaks, which can reflect the characteristics (severity and duration) of the outbreaks.
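To make the pipeline described above more concrete, the following is a minimal sketch, not the authors' implementation. It assumes that the diversity of a time slice is the mean pairwise Jaccard distance between tweet token sets, estimated from randomly sampled pairs, and that the relationship to event magnitude is measured with a Pearson correlation. The abstract specifies neither the diversity measure nor the correlation statistic, so both are assumptions, and the names `jaccard_distance`, `sampled_diversity`, and `pearson` are hypothetical.

```python
import random
from math import sqrt


def jaccard_distance(a, b):
    """Jaccard distance between two token sets (0 = identical, 1 = disjoint)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def sampled_diversity(token_sets, n_pairs=1000, seed=0):
    """Estimate mean pairwise Jaccard distance from randomly sampled pairs.

    Assumption: this stands in for the sampling-based diversity statistic;
    the abstract does not give the exact estimator.
    """
    if len(token_sets) < 2:
        return 0.0
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_pairs):
        a, b = rng.sample(token_sets, 2)  # two distinct tweets
        total += jaccard_distance(a, b)
    return total / n_pairs


def pearson(xs, ys):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


# Illustrative toy data only: one list of tweet token sets per day.
daily_tweets = [
    [{"flu", "shot", "clinic"}, {"rain", "today"}, {"new", "phone"}],
    [{"flu", "outbreak", "city"}, {"flu", "outbreak", "school"}, {"flu", "cases"}],
]
daily_diversity = [sampled_diversity(day) for day in daily_tweets]
```

Under these assumptions, a day on which many tweets share vocabulary about a single event yields a lower diversity value than a day of unrelated chatter. A per-day diversity series could then be passed to `pearson` together with an event-magnitude series of the same length.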