Machine learning in automated text categorization
ACM Computing Surveys (CSUR)
Improving Short-Text Classification using Unlabeled Data for Classification Problems
ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning
A web-based kernel function for measuring the similarity of short text snippets
Proceedings of the 15th international conference on World Wide Web
Measuring semantic similarity between words using web search engines
Proceedings of the 16th international conference on World Wide Web
The Google Similarity Distance
IEEE Transactions on Knowledge and Data Engineering
Clustering short texts using wikipedia
SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Proceedings of the 17th international conference on World Wide Web
Collective annotation of Wikipedia entities in web text
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
WikiRelate! computing semantic relatedness using wikipedia
AAAI'06 proceedings of the 21st national conference on Artificial intelligence - Volume 2
International Journal of Human-Computer Studies
Wikipedia-based semantic interpretation for natural language processing
Journal of Artificial Intelligence Research
Feature generation for text categorization using world knowledge
IJCAI'05 Proceedings of the 19th international joint conference on Artificial intelligence
Short text classification in twitter to improve information filtering
Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
TAGME: on-the-fly annotation of short text fragments (by wikipedia entities)
CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
Towards effective short text deep classification
Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
Discovering context: classifying tweets through a semantic transform based on wikipedia
FAC'11 Proceedings of the 6th international conference on Foundations of augmented cognition: directing the future of adaptive systems
Robust disambiguation of named entities in text
EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Graph-based collective classification for tweets
Proceedings of the 21st ACM international conference on Information and knowledge management
ECIR'13 Proceedings of the 35th European conference on Advances in Information Retrieval
Improved text annotation with Wikipedia entities
Proceedings of the 28th Annual ACM Symposium on Applied Computing
Harnessing linked knowledge sources for topic classification in social media
Proceedings of the 24th ACM Conference on Hypertext and Social Media
A framework for benchmarking entity-annotation systems
Proceedings of the 22nd international conference on World Wide Web
Short text classification by detecting information path
Proceedings of the 22nd ACM international conference on Conference on information & knowledge management
Hi-index | 0.00 |
We propose a novel approach to the classification of short texts based on two factors: the use of Wikipedia-based annotators that have been recently introduced to detect the main topics present in an input text, represented via Wikipedia pages, and the design of a novel classification algorithm that measures the similarity between the input text and each output category by deploying only their annotated topics and the Wikipedia link-structure. Our approach waives the common practice of expanding the feature-space with new dimensions derived either from explicit or from latent semantic analysis. As a consequence it is simple and maintains a compact intelligible representation of the output categories. Our experiments show that it is efficient in construction and query time, accurate as state-of-the-art classifiers (see e.g. Phan et al. WWW '08), and robust with respect to concept drifts and input sources.