In this paper, we investigate information retrieval in microblogs using a range of state-of-the-art features. Besides posting, microbloggers search for fresh and relevant information related to their interests by submitting queries to a microblog search engine. Most approaches that retrieve information from microblogs exploit features such as the recency of a microblog or the authority of its author to improve the quality of their results. We evaluated several of these state-of-the-art features to determine which ones discriminate relevant from irrelevant microblogs for a given information need. We then used the selected features to learn models and measured their effectiveness in a microblog search task. We conducted a series of experiments using the dataset and topics of the TREC Microblog 2011 and 2012 tracks. The results show that content, hypertextuality, and recency are the best predictors of relevance. We also found that Naive Bayes was the most effective learning approach for this type of classification.
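To make the classification setup concrete, the following is a minimal sketch of a Naive Bayes relevance classifier over three features of the kind the abstract names (content, hypertextuality, recency). It is not the authors' implementation: the feature names, the toy training data, the Gaussian likelihood, and the variance floor are all assumptions chosen for illustration, and the abstract does not say which Naive Bayes variant or feature encoding was used.

```python
import math
from collections import defaultdict

# Hypothetical feature vector per microblog (all names are illustrative):
#   [content_score, has_url, age_hours]
# content_score: query/text similarity, has_url: hypertextuality flag,
# age_hours: time between query and post (recency).
# Toy, made-up training data; label 1 = relevant, 0 = irrelevant.
TRAIN = [
    ([0.9, 1.0, 2.0], 1),
    ([0.8, 1.0, 5.0], 1),
    ([0.7, 0.0, 1.0], 1),
    ([0.2, 0.0, 40.0], 0),
    ([0.1, 0.0, 30.0], 0),
    ([0.3, 1.0, 50.0], 0),
]


def fit_gaussian_nb(samples, var_floor=1e-3):
    """Estimate a log prior and per-feature (mean, variance) for each class."""
    by_class = defaultdict(list)
    for features, label in samples:
        by_class[label].append(features)

    model = {}
    total = len(samples)
    for label, rows in by_class.items():
        stats = []
        for column in zip(*rows):
            mean = sum(column) / len(column)
            # var_floor avoids zero variance on tiny samples (an assumption).
            var = sum((v - mean) ** 2 for v in column) / len(column) + var_floor
            stats.append((mean, var))
        model[label] = (math.log(len(rows) / total), stats)
    return model


def predict(model, features):
    """Return the class with the highest log posterior for one microblog."""
    def log_posterior(label):
        log_prior, stats = model[label]
        log_likelihood = 0.0
        for value, (mean, var) in zip(features, stats):
            log_likelihood += (
                -0.5 * math.log(2 * math.pi * var)
                - (value - mean) ** 2 / (2 * var)
            )
        return log_prior + log_likelihood

    return max(model, key=log_posterior)


model = fit_gaussian_nb(TRAIN)
fresh_linked_post = [0.85, 1.0, 3.0]
stale_offtopic_post = [0.15, 0.0, 45.0]
print(predict(model, fresh_linked_post), predict(model, stale_offtopic_post))
```

Treating the binary URL flag as Gaussian is a simplification made to keep the sketch short; a Bernoulli likelihood for that feature would be a more natural choice in practice.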