Assessing agreement on classification tasks: the kappa statistic
Computational Linguistics
Automatic feedback using past queries: social searching?
Proceedings of the 20th annual international ACM SIGIR conference on Research and development in information retrieval
Foundations of statistical natural language processing
Foundations of statistical natural language processing
Learning Algorithms for Keyphrase Extraction
Information Retrieval
Domain-Specific Keyphrase Extraction
IJCAI '99 Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence
Sequential conditional Generalized Iterative Scaling
ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Learning to rank using gradient descent
ICML '05 Proceedings of the 22nd international conference on Machine learning
Finding advertising keywords on web pages
Proceedings of the 15th international conference on World Wide Web
A web-based kernel function for measuring the similarity of short text snippets
Proceedings of the 15th international conference on World Wide Web
Generating query substitutions
Proceedings of the 15th international conference on World Wide Web
Coherent keyphrase extraction via web mining
IJCAI'03 Proceedings of the 18th international joint conference on Artificial intelligence
Similarity measures for short segments of text
ECIR'07 Proceedings of the 29th European conference on IR research
Proceedings of the 17th international conference on World Wide Web
Consistent phrase relevance measures
Proceedings of the 2nd International Workshop on Data Mining and Audience Intelligence for Advertising
Towards a Novel Association Measure via Web Search Results Mining
PAKDD '09 Proceedings of the 13th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining
Web Search Clustering and Labeling with Hidden Topics
ACM Transactions on Asian Language Information Processing (TALIP)
Large-scale computation of distributional similarities for queries
NAACL-Short '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Short Papers
Clustering queries for better document ranking
Proceedings of the 18th ACM conference on Information and knowledge management
Learning term-weighting functions for similarity measures
EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 2
Precomputing search features for fast and accurate query classification
Proceedings of the third ACM international conference on Web search and data mining
Growing related words from seed via user behaviors: a re-ranking based approach
ACLstudent '10 Proceedings of the ACL 2010 Student Research Workshop
Organizing query completions for web search
CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
German encyclopedia alignment based on information retrieval techniques
ECDL'10 Proceedings of the 14th European conference on Research and advanced technology for digital libraries
Empirical study of topic modeling in Twitter
Proceedings of the First Workshop on Social Media Analytics
User Behaviors in Related Word Retrieval and New Word Detection: A Collaborative Perspective
ACM Transactions on Asian Language Information Processing (TALIP)
Transferring topical knowledge from auxiliary long texts for short text clustering
Proceedings of the 20th ACM international conference on Information and knowledge management
Summarizing and extracting online public opinion from blog search results
DASFAA'10 Proceedings of the 15th international conference on Database Systems for Advanced Applications - Volume Part I
Short text classification improved by learning multi-granularity topics
IJCAI'11 Proceedings of the Twenty-Second international joint conference on Artificial Intelligence - Volume Three
CluChunk: clustering large scale user-generated content incorporating chunklet information
Proceedings of the 1st International Workshop on Big Data, Streams and Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applications
Measuring semantic relatedness using multilingual representations
SemEval '12 Proceedings of the First Joint Conference on Lexical and Computational Semantics - Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation
TCSST: transfer classification of short & sparse text using external data
Proceedings of the 21st ACM international conference on Information and knowledge management
Extended information inference model for unsupervised categorization of web short texts
Journal of Information Science
Multimodal alignment of scholarly documents and their presentations
Proceedings of the 13th ACM/IEEE-CS joint conference on Digital libraries
Enhancing short text clustering with small external repositories
AusDM '11 Proceedings of the Ninth Australasian Data Mining Conference - Volume 121
Short text classification by detecting information path
Proceedings of the 22nd ACM international conference on information & knowledge management
Exploiting topic tracking in real-time tweet streams
Proceedings of the 2013 international workshop on Mining unstructured big data using natural language processing
Improving short text classification using public search engines
IUKM'13 Proceedings of the 2013 international conference on Integrated Uncertainty in Knowledge Modelling and Decision Making
An efficient Particle Swarm Optimization approach to cluster short texts
Information Sciences: an International Journal
In this paper we improve on previous work on measuring the similarity of short segments of text in two ways. First, we introduce a Web-relevance similarity measure and demonstrate its effectiveness. This measure extends the Web-kernel similarity function introduced by Sahami and Heilman (2006) by using a relevance-weighted inner product of term occurrences rather than TF×IDF weighting. Second, we show that a machine learning approach can further improve the accuracy of similarity measures. Our methods outperform other state-of-the-art methods on a general query suggestion task across multiple evaluation metrics.
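
As a rough illustration of the contrast drawn in the abstract, the following Python sketch compares two ways of weighting the terms of an expanded short text before taking a normalized inner product. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names, the idea that per-term relevance scores arrive as one dictionary per retrieved document, and the choice to sum those scores are all hypothetical, since the abstract does not specify them.

```python
import math
from collections import defaultdict

def tfidf_weights(term_counts, doc_freq, n_docs):
    """TF x IDF weight per term (assumed form: count * log(N / df))."""
    return {t: c * math.log(n_docs / doc_freq.get(t, 1))
            for t, c in term_counts.items()}

def relevance_weights(relevance_per_doc):
    """Hypothetical relevance weighting: sum each term's relevance score
    over the documents retrieved for the short text."""
    totals = defaultdict(float)
    for scores in relevance_per_doc:
        for term, score in scores.items():
            totals[term] += score
    return dict(totals)

def normalize(vec):
    """Scale a sparse term vector to unit L2 length."""
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}

def similarity(weights_a, weights_b):
    """Inner product of the two L2-normalized term vectors."""
    a, b = normalize(weights_a), normalize(weights_b)
    return sum(w * b.get(t, 0.0) for t, w in a.items())

# Toy relevance scores for two short texts (made-up values).
text_a = relevance_weights([{"apple": 0.5, "pie": 0.25}, {"apple": 0.25}])
text_b = relevance_weights([{"apple": 0.5, "fruit": 0.5}])
print(similarity(text_a, text_b))
```

Only the weighting step differs between the two variants; the similarity itself stays an inner product of normalized vectors, so either weight dictionary can be passed to `similarity`.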