Online forums are becoming a popular way of finding useful information on the web. Search over existing discussion threads has so far been limited to keyword-based search, because it requires minimal effort on the part of users. However, a small number of keywords often cannot capture all the relevant context of a complex query. Example-based search, which retrieves similar discussion threads given one exemplary thread, is an alternative approach that lets the user provide richer context and can vastly improve forum search results. In this paper, we address the problem of finding threads similar to a given thread. To this end, we propose a novel methodology for estimating similarity between discussion threads. Our method exploits the thread structure to decompose threads into a set of weighted, overlapping components. It then estimates pairwise thread similarities by quantifying how well the information in the threads is mutually contained within each other, using lexical similarities between their underlying components. We compare our proposed methods on real datasets against state-of-the-art thread retrieval mechanisms and show that our techniques outperform them by large margins on popular retrieval evaluation measures such as NDCG, MAP, Precision@k and MRR. In particular, consistent improvements of up to 10% are observed on all evaluation measures.
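The idea of decomposing threads into weighted, overlapping components and scoring mutual containment can be illustrated with a minimal Python sketch. The abstract does not specify the components, the weights, or the containment formula, so everything concrete here is an assumption made for illustration: a thread is a list of post strings, each component pairs the opening post with one reply (so components overlap on the opening post), weights decay with reply position, lexical similarity is cosine over raw term counts, and the final score is the mean of the two directional containments. The function names `components`, `containment` and `thread_similarity` are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; a real system would likely stem and drop stopwords.
    return text.lower().split()


def cosine(a, b):
    # Cosine similarity between two term-count vectors (Counter objects).
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def components(posts):
    # ASSUMPTION: component 0 is the opening post alone; every other component
    # is the opening post plus one reply, weighted by 1 / (1 + reply position).
    if not posts:
        return []
    first = tokenize(posts[0])
    comps = [(1.0, Counter(first))]
    for i, reply in enumerate(posts[1:], start=1):
        comps.append((1.0 / (1 + i), Counter(first + tokenize(reply))))
    return comps


def containment(src, dst):
    # ASSUMPTION: how well src is contained in dst is the weighted average,
    # over src components, of the best lexical match among dst components.
    total = sum(weight for weight, _ in src)
    if not total or not dst:
        return 0.0
    matched = sum(w * max(cosine(v, u) for _, u in dst) for w, v in src)
    return matched / total


def thread_similarity(thread_a, thread_b):
    # Symmetric score: mean of the two directional containments.
    ca, cb = components(thread_a), components(thread_b)
    return 0.5 * (containment(ca, cb) + containment(cb, ca))


if __name__ == "__main__":
    t1 = ["laptop will not boot after update", "try safe mode", "reinstall the driver"]
    t2 = ["laptop fails to boot after driver update", "boot in safe mode first"]
    t3 = ["best pasta recipe", "use fresh basil"]
    print(thread_similarity(t1, t2), thread_similarity(t1, t3))
```

Taking the best match per component, rather than comparing whole threads as single bags of words, is one way to let a short thread score well against a long one that covers the same subtopics; other aggregation choices are equally plausible under the abstract's description.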