While social media receive considerable attention from the scientific community, little work addresses high-recall retrieval of messages relevant to a discussion. Hashtag-based search is widely used to retrieve data from social media. This work shows the limitations of that approach: the majority of relevant messages contain no hashtag at all, and the hashtags in use change unpredictably as the conversation evolves. To overcome these limitations, we propose an alternative retrieval method. Given an input stream of messages as an example of the discussion, our method extracts the most relevant words from it and queries the social network for more messages containing these words. It then uses an LDA topic model to filter out messages that do not belong to the discussion. We demonstrate this concept on manually built collections of tweets about major sports and music events.
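The keyword-extraction step can be illustrated with a minimal sketch. The abstract does not say how word relevance is scored, so the smoothed log-ratio against a background sample used here is an assumption, as are the names `tokenize` and `top_keywords` and the toy messages. The querying and LDA filtering steps are not shown.

```python
import math
import re
from collections import Counter

# Hypothetical tokenizer: keeps hashtags and mentions as single tokens.
TOKEN_RE = re.compile(r"[a-z0-9#@']+")


def tokenize(text):
    """Lowercase a message and split it into word-like tokens."""
    return TOKEN_RE.findall(text.lower())


def top_keywords(example_msgs, background_msgs, k=5):
    """Return the k words most characteristic of the example stream.

    Assumed scoring (not taken from the paper): the share of example
    messages containing the word, weighted by the log ratio of that share
    to the share of background messages containing it (add-one smoothed).
    """
    ex = Counter(t for m in example_msgs for t in set(tokenize(m)))
    bg = Counter(t for m in background_msgs for t in set(tokenize(m)))
    n_ex, n_bg = len(example_msgs), len(background_msgs)
    scores = {}
    for word, count in ex.items():
        p_ex = (count + 1) / (n_ex + 2)
        p_bg = (bg[word] + 1) / (n_bg + 2)
        scores[word] = p_ex * math.log(p_ex / p_bg)
    # Sort by descending score; break ties alphabetically for stable output.
    return sorted(scores, key=lambda w: (-scores[w], w))[:k]


# Toy data, invented for illustration only.
example = [
    "Great goal in the final #worldcup",
    "what a goal by the striker",
    "the final is tense tonight",
]
background = [
    "the weather is great today",
    "what is for lunch",
    "reading a book in the park",
    "the train is late again",
]
keywords = top_keywords(example, background, k=2)
```

On this toy input, common words such as "the" rank low because they are just as frequent in the background sample, while words specific to the discussion rise to the top. The selected words would then serve as query terms, with a topic model applied afterwards to discard off-topic results.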