Computing semantic relatedness using Wikipedia-based explicit semantic analysis
IJCAI'07 Proceedings of the 20th international joint conference on Artifical intelligence
Short text classification in twitter to improve information filtering
Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
The ECIR 2010 large scale hierarchical classification workshop
ACM SIGIR Forum
Representation models for text classification: a comparative analysis over three web document types
Proceedings of the 2nd International Conference on Web Intelligence, Mining and Semantics
Classification of short texts by deploying topical annotations
ECIR'12 Proceedings of the 34th European conference on Advances in Information Retrieval
TCSST: transfer classification of short & sparse text using external data
Proceedings of the 21st ACM international conference on Information and knowledge management
Supervised learning of semantic relatedness
ECML PKDD'12 Proceedings of the 2012 European conference on Machine Learning and Knowledge Discovery in Databases - Volume Part I
Probabilistic semantic similarity measurements for noisy short texts using Wikipedia entities
Proceedings of the 22nd ACM international conference on Conference on information & knowledge management
Hi-index | 0.00 |
Recently, more and more short texts (e.g., ads, tweets) appear on the Web. Classifying short texts into a large taxonomy like ODP or Wikipedia category system has become an important mining task to improve the performance of many applications such as contextual advertising and topic detection for micro-blogging. In this paper, we propose a novel multi-stage classification approach to solve the problem. First, explicit semantic analysis is used to add more features for both short texts and categories. Second, we leverage information retrieval technologies to fetch the most relevant categories for an input short text from thousands of candidates. Finally, a SVM classifier is applied on only a few selected categories to return the final answer. Our experimental results show that the proposed method achieved significant improvements on classification accuracy compared with several existing state of art approaches.