Query-independent knowledge has proven very effective for improving document ranking in web search. This knowledge is derived from an analysis of the graph structure of hypertext links between documents. However, in many settings explicit hypertext links are absent or sparse, for example in corporate intranets. Previous work has sought to induce a link graph from various measures of similarity between documents, after which standard link analysis algorithms such as PageRank can be applied. In this paper, we propose and examine an alternative way to derive query-independent knowledge that is not based on link analysis. Instead, we analyze each document independently and calculate a "specificity" score based on (i) normalized inverse document frequency and (ii) term entropies. We then discuss two re-ranking strategies, hard cutoff and soft cutoff, that make use of these query-independent "specificity" scores. In experiments on standard TREC test sets, using the TREC disk 4 collection and queries from the TREC 8 ad hoc topics, our re-ranking algorithms improve mean reciprocal rank by about 4% and precision at 5 and 10 by 4% to 6%, respectively. Empirical tests demonstrate that the entropy-based algorithm produces stable results across (i) retrieval models, (ii) query sets, and (iii) collections.
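As a rough illustration of the idea, the Python sketch below computes a per-document specificity score and applies two re-ranking rules. The abstract does not give the exact formulas, so everything here is an assumption made for illustration: specificity is taken as the mean of each distinct term's IDF divided by log N, "hard cutoff" is read as demoting documents below a threshold, and "soft cutoff" is read as a linear interpolation with the retrieval score. The function names, the `threshold` and `weight` parameters, and the toy documents are hypothetical, and the entropy-based variant is not sketched.

```python
import math
from collections import Counter


def idf_specificity(docs):
    """Mean normalized IDF over each document's distinct terms.

    Assumed definition for illustration; not the paper's formula.
    """
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        terms = set(doc)
        if not terms or n < 2:
            scores.append(0.0)
            continue
        # log(n / df) / log(n) maps each term's IDF into [0, 1].
        total = sum(math.log(n / df[t]) / math.log(n) for t in terms)
        scores.append(total / len(terms))
    return scores


def hard_cutoff(ranked, specificity, threshold):
    """Assumed reading of 'hard cutoff': demote low-specificity documents.

    Documents at or above the threshold keep their relative order and
    come first; the rest follow, also in their original order.
    """
    keep = [d for d in ranked if specificity[d] >= threshold]
    demoted = [d for d in ranked if specificity[d] < threshold]
    return keep + demoted


def soft_cutoff(retrieval_scores, specificity, weight=0.5):
    """Assumed reading of 'soft cutoff': interpolate the two scores."""
    combined = {
        d: (1 - weight) * s + weight * specificity[d]
        for d, s in retrieval_scores.items()
    }
    return sorted(combined, key=combined.get, reverse=True)


# Toy collection: each document is a list of tokens, identified by index.
docs = [["the", "cat"], ["the", "dog"], ["the", "the", "rare", "term"]]
spec = idf_specificity(docs)
hard = hard_cutoff([0, 1, 2], spec, threshold=0.6)
soft = soft_cutoff({0: 0.9, 1: 0.8, 2: 0.7}, spec, weight=0.5)
```

In this toy collection the third document contains the largest share of rare terms, so it receives the highest specificity and is promoted by both rules. Real retrieval scores would need to be normalized to a range comparable with the specificity score before interpolating.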