Clustering short texts, such as snippets, poses great challenges for existing aggregated search techniques because of data sparseness and the complex semantics of natural language. Because short texts do not provide sufficient term occurrence information, traditional text representation methods, such as the "bag of words" model, have several limitations when applied directly to short-text tasks. In this paper, we propose a novel framework that improves the performance of short-text clustering by exploiting the internal semantics of the original text and external concepts from world knowledge. The proposed method employs a hierarchical three-level structure to tackle the data sparsity of the original short texts and reconstructs the corresponding feature space by integrating multiple semantic knowledge bases, namely Wikipedia and WordNet. Empirical evaluation on Reuters and a real web dataset demonstrates that our approach achieves significant improvement over state-of-the-art methods.
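To make the sparsity problem concrete, the following is a minimal sketch, not the paper's framework: it shows how two related snippets can have zero bag-of-words similarity, and how adding concept features from an external source can make them comparable. The `TOY_CONCEPTS` dictionary, the `enrich` function, and the `weight` parameter are all hypothetical stand-ins for illustration; a real system would query Wikipedia or WordNet rather than a hand-written table.

```python
import math
from collections import Counter

# Hypothetical stand-in for an external knowledge base lookup (for example
# Wikipedia or WordNet). The entries are invented for illustration only.
TOY_CONCEPTS = {
    "bmw": ["car"],
    "sedan": ["car"],
    "jaguar": ["car", "animal"],
    "leopard": ["animal"],
}


def bag_of_words(text):
    """Count lowercase whitespace-separated tokens."""
    return Counter(text.lower().split())


def enrich(text, concepts=TOY_CONCEPTS, weight=1.0):
    """Return term features plus concept features looked up per term.

    Assumption: every concept linked to a term is added with a fixed weight.
    """
    features = Counter()
    for term, count in bag_of_words(text).items():
        features["term:" + term] += count
        for concept in concepts.get(term, []):
            features["concept:" + concept] += weight * count
    return features


def cosine(a, b):
    """Cosine similarity between two sparse feature dictionaries."""
    dot = sum(value * b[key] for key, value in a.items() if key in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


snippet_a = "bmw sedan"
snippet_b = "jaguar dealership"

# No shared terms, so the plain bag-of-words similarity is zero.
plain_similarity = cosine(bag_of_words(snippet_a), bag_of_words(snippet_b))

# Both snippets map to the shared "car" concept, so similarity becomes positive.
enriched_similarity = cosine(enrich(snippet_a), enrich(snippet_b))
```

The sketch uses a flat feature space for brevity; the hierarchical three-level structure described in the abstract is not reproduced here.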