In this paper, we address the problem of real-time filtering on the Twitter microblogging platform. We adapt an effective traditional news filtering technique, which uses a text classifier inspired by Rocchio's relevance feedback algorithm, to build and dynamically update a profile of the user's interests in real time. Our adaptation tackles two challenges that are particularly prevalent in Twitter: sparsity and drift. Sparsity stems from the brevity of tweets, while drift occurs as events related to the topic develop or the interests of the user change. First, to tackle the acute sparsity problem, we apply query expansion to derive terms or related tweets for a richer initialisation of the user's interests within the profile. Second, to deal with drift, we modify the user profile to balance the importance of short-term interests, i.e. emerging subtopics, against long-term interests in the overall topic. Moreover, we investigate an event detection method that uses Twitter and newswire streams to predict the times at which drift may happen. Through experiments on the TREC 2012 Microblog track, we show that our approach is effective for a number of common filtering metrics, such as the user's utility, and that it compares favourably with state-of-the-art news filtering baselines. Our results also uncover the impact of different factors on handling topic drift.
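To make the idea of a dynamically updated, Rocchio-inspired profile more concrete, the following is a minimal sketch and not the implementation evaluated in the paper. The class name `AdaptiveProfile`, the helper functions, and all parameter values (`threshold`, `beta`, `gamma`, `decay`, `short_weight`) are assumptions chosen for illustration. Expansion terms are assumed to be supplied by some external query expansion step, and the short-term and long-term interests are combined by a simple linear interpolation, which is only one possible way to balance them.

```python
import math
import re


def vectorise(text):
    """Return an L2-normalised term-frequency vector for a short text."""
    counts = {}
    for term in re.findall(r"[a-z0-9#@_]+", text.lower()):
        counts[term] = counts.get(term, 0.0) + 1.0
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {t: c / norm for t, c in counts.items()} if norm else {}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class AdaptiveProfile:
    """Hypothetical Rocchio-style user profile with short- and long-term parts."""

    def __init__(self, query, expansion_terms=(), threshold=0.2,
                 beta=0.75, gamma=0.25, decay=0.8, short_weight=0.5):
        # Richer initialisation: the query plus externally supplied expansion terms.
        self.long_term = vectorise(" ".join([query, *expansion_terms]))
        self.short_term = {}              # recent relevant tweets, decayed over time
        self.threshold = threshold        # illustrative fixed dissemination threshold
        self.beta, self.gamma = beta, gamma
        self.decay, self.short_weight = decay, short_weight

    def score(self, tweet):
        vec = vectorise(tweet)
        # Until feedback arrives, only the long-term profile is used.
        lam = self.short_weight if self.short_term else 0.0
        return ((1.0 - lam) * cosine(self.long_term, vec)
                + lam * cosine(self.short_term, vec))

    def accepts(self, tweet):
        return self.score(tweet) >= self.threshold

    def update(self, tweet, relevant):
        """Rocchio-style feedback: add relevant tweets, subtract non-relevant ones."""
        vec = vectorise(tweet)
        step = self.beta if relevant else -self.gamma
        for term, weight in vec.items():
            self.long_term[term] = self.long_term.get(term, 0.0) + step * weight
        # Drop terms whose weight is no longer positive.
        self.long_term = {t: w for t, w in self.long_term.items() if w > 0.0}
        if relevant:
            # Older short-term evidence decays; the newest relevant tweet dominates.
            merged = {t: self.decay * w for t, w in self.short_term.items()}
            for term, weight in vec.items():
                merged[term] = merged.get(term, 0.0) + weight
            self.short_term = merged
```

In use, a profile is created from the topic query, each incoming tweet is passed to `accepts`, and the user's judgement on a delivered tweet is fed back through `update`. A larger `short_weight` or a smaller `decay` makes the profile follow emerging subtopics more aggressively, at the cost of stability on the overall topic.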