Measuring the similarity between documents and queries has been extensively studied in information retrieval. However, a growing number of tasks require computing the similarity between two very short segments of text. These tasks include query reformulation, sponsored search, and image retrieval. Standard text similarity measures perform poorly on such tasks because of data sparseness and the lack of context. In this work, we study this problem from an information retrieval perspective, focusing on text representations and similarity measures. We examine a range of similarity measures, including purely lexical measures, stemming, and language modeling-based measures. We formally evaluate and analyze the methods on a query-query similarity task using 363,822 queries from a web search log. Our analysis provides insights into the strengths and weaknesses of each method, including important tradeoffs between effectiveness and efficiency.
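The sparseness problem with purely lexical measures can be illustrated with a minimal sketch. The code below is not the paper's implementation: the function names, the whitespace tokenizer, and the example queries are all assumptions made for illustration. It computes Jaccard overlap and term-frequency cosine similarity between two short queries.

```python
import math
from collections import Counter


def tokens(text):
    # Assumed tokenizer: lowercase and split on whitespace (no stemming).
    return text.lower().split()


def jaccard(a, b):
    # Set overlap of the two token sets: |A & B| / |A | B|.
    sa, sb = set(tokens(a)), set(tokens(b))
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def cosine(a, b):
    # Cosine similarity between raw term-frequency vectors.
    ca, cb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(ca[t] * cb[t] for t in ca)
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


if __name__ == "__main__":
    # Hypothetical query pairs, not taken from the paper's search log.
    pairs = [
        ("cheap flights", "low cost airfare"),       # related, no shared terms
        ("new york hotels", "hotels in new york"),   # related, shared terms
    ]
    for q1, q2 in pairs:
        print(q1, "|", q2, "|", jaccard(q1, q2), cosine(q1, q2))
```

In this sketch the first pair scores zero under both measures even though the two queries express a similar need, because they share no terms. That is the kind of failure that measures using stemming or richer text representations aim to reduce.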