We propose a simple, scalable, and non-parametric approach for short text classification. Leveraging the well-studied and scalable Information Retrieval (IR) framework, our approach mimics the human labeling process for a piece of short text. It first selects the most representative and topically indicative words from a given short text as query words, and then searches for a small set of labeled short texts that best match the query words. The predicted category label is the majority vote of the search results. Evaluated on a collection of more than 12K Web snippets, the proposed approach achieves classification accuracy comparable to that of the baseline Maximum Entropy classifier using as few as 3 query words and the top-5 best matching search hits. Among the four query word selection schemes proposed and evaluated in our experiments, term frequency together with clarity gives the best classification accuracy.
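The select-search-vote pipeline described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The query word score used here (term frequency times a smoothed inverse document frequency) is a stand-in assumption for the paper's selection schemes, not the term frequency with clarity scheme. The overlap-based search is likewise a toy substitute for a real IR engine. All function names and the `TOY_LABELED` data are hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy collection of labeled short texts (not from the paper).
TOY_LABELED = [
    ("python java programming language", "computers"),
    ("software programming code compiler", "computers"),
    ("football league match goal", "sports"),
    ("tennis match player tournament", "sports"),
    ("stock market shares investor", "business"),
]


def tokenize(text):
    """Lowercase whitespace tokenization (an assumption; no stemming or stop words)."""
    return text.lower().split()


def build_idf(labeled):
    """Smoothed inverse document frequency over the labeled short texts."""
    n = len(labeled)
    df = Counter()
    for text, _ in labeled:
        df.update(set(tokenize(text)))
    return {w: math.log((n + 1) / (c + 1)) + 1.0 for w, c in df.items()}


def select_query_words(text, idf, num_words=3):
    """Pick the highest-scoring words as the query.

    Score = tf * idf, an assumed stand-in for the paper's selection schemes.
    Words unseen in the labeled collection score 0 because they cannot match.
    """
    tf = Counter(tokenize(text))
    ranked = sorted(tf, key=lambda w: (-tf[w] * idf.get(w, 0.0), w))
    return ranked[:num_words]


def search(query_words, labeled, idf, top_k=5):
    """Rank labeled texts by the summed idf of the query words they contain."""
    hits = []
    for text, label in labeled:
        tokens = set(tokenize(text))
        score = sum(idf.get(w, 0.0) for w in query_words if w in tokens)
        if score > 0:
            hits.append((score, label))
    hits.sort(key=lambda h: -h[0])  # stable sort keeps collection order on ties
    return hits[:top_k]


def classify(text, labeled, num_words=3, top_k=5):
    """Predict the majority label among the top-k hits, or None if nothing matches."""
    idf = build_idf(labeled)
    query = select_query_words(text, idf, num_words)
    hits = search(query, labeled, idf, top_k)
    if not hits:
        return None
    votes = Counter(label for _, label in hits)
    return votes.most_common(1)[0][0]


print(classify("new programming language compiler released", TOY_LABELED))
```

The defaults `num_words=3` and `top_k=5` mirror the smallest setting reported in the abstract. In practice the linear scan in `search` would be replaced by an inverted index or an existing search engine, which is where the scalability of the approach comes from.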