Clustering to Improve Microblog Stream Summarization
SYNASC '12 Proceedings of the 2012 14th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing
Microblogging use has grown massively over the past few years. According to recent statistics, Twitter (the most popular microblogging platform) carries over 500 million posts per day. To help users manage this information overload, or to assess the full information potential of microblogging streams, a few summarization algorithms have been proposed. However, they are designed to work on a stream of posts filtered on a particular keyword, whereas most streams suffer from noise or contain posts referring to more than one topic. As a result, the generated summary is often incomplete or even meaningless. We approach the problem of summarizing a stream and propose adding a layer of text clustering before the summarization step. We first identify the events users are talking about in the stream, then group posts by event, and finally cluster each group hierarchically. We show that generating an agglomerative hierarchical cluster tree over the posts and then applying a summarization algorithm improves the quality of the summary.
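
The cluster-then-summarize idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name a distance measure, linkage criterion, or summarization algorithm, so the sketch assumes Jaccard distance over word sets, average linkage cut at a hypothetical `threshold`, and a stand-in "summary" that simply picks the most central post (the medoid) of each cluster. The event-identification step is omitted, and all function names are illustrative.

```python
import re
from itertools import combinations


def tokens(post):
    """Lowercased word set of a post (a deliberately crude tokenizer)."""
    return set(re.findall(r"[a-z0-9#@']+", post.lower()))


def jaccard_distance(a, b):
    """1 - |a & b| / |a | b|; two empty sets count as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def agglomerative_clusters(posts, threshold=0.8):
    """Average-linkage agglomerative clustering.

    Starts with one cluster per post and repeatedly merges the closest
    pair until the closest pair is farther apart than `threshold`
    (an assumed cut-off, not a value from the paper).
    Returns clusters as sorted lists of post indices.
    """
    bags = [tokens(p) for p in posts]
    clusters = [[i] for i in range(len(posts))]

    def linkage(c1, c2):
        total = sum(jaccard_distance(bags[i], bags[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        x, y = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        if linkage(clusters[x], clusters[y]) > threshold:
            break
        merged = clusters[x] + clusters[y]
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(merged)
    return [sorted(c) for c in clusters]


def representative(posts, cluster):
    """Index of the medoid: the post with the lowest total distance to the rest."""
    bags = {i: tokens(posts[i]) for i in cluster}
    return min(
        cluster,
        key=lambda i: sum(jaccard_distance(bags[i], bags[j]) for j in cluster),
    )


def summarize(posts, threshold=0.8):
    """One representative post per cluster, largest cluster first."""
    clusters = agglomerative_clusters(posts, threshold)
    clusters.sort(key=len, reverse=True)
    return [posts[representative(posts, c)] for c in clusters]
```

Calling `summarize(posts)` on a mixed stream returns one post per discovered cluster, so a stream covering two topics yields two summary lines instead of a single blended one. A real system would replace the medoid pick with an actual summarization algorithm and use a more robust text representation.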