Mining associated words is not a new problem, but because of its wide range of applications it remains an important issue in information retrieval. Existing estimators, such as joint probability and the word association norm, do not consider the density of the words present in each window. In this paper, we incorporate word density and propose a density-based estimator to measure the association between words. Experimental results based on human judgments and on precision collected from search engines show that incorporating word density improves the precision of the estimators. Across all window sizes considered, our estimator outperforms the other estimators. We also observe that all of the estimators, both the existing ones and the proposed one, perform relatively better when the windows contain around five sentences. Finally, using the Spearman rank-order correlation coefficient, we show that our estimator produces higher-quality rankings of the associated terms.
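To make the baseline setting concrete, the following is a minimal sketch of the two existing window-based estimators named above (joint probability and a word association norm in the pointwise mutual information style), plus a Spearman rank-order correlation for comparing two rankings. Everything here is an illustrative assumption: the function names, the whitespace tokenization, and the non-overlapping sentence windows are hypothetical choices, and the density-based estimator itself is not reproduced because the abstract does not define it.

```python
import math
from collections import Counter
from itertools import combinations


def windows(sentences, size):
    """Group sentences into consecutive, non-overlapping windows of `size`
    sentences and return each window as a list of lowercase tokens.
    (Whitespace tokenization is an assumption made for this sketch.)"""
    return [
        [w for s in sentences[i:i + size] for w in s.lower().split()]
        for i in range(0, len(sentences), size)
    ]


def cooccurrence_stats(wins):
    """Count in how many windows each word, and each unordered word pair, occurs."""
    word_df, pair_df = Counter(), Counter()
    for win in wins:
        vocab = sorted(set(win))  # presence only: density inside the window is ignored
        word_df.update(vocab)
        pair_df.update(combinations(vocab, 2))
    return word_df, pair_df


def joint_probability(x, y, pair_df, n):
    """Fraction of the n windows that contain both x and y."""
    return pair_df[tuple(sorted((x, y)))] / n


def association_norm(x, y, word_df, pair_df, n):
    """log2( P(x, y) / (P(x) * P(y)) ), with probabilities estimated per window."""
    pxy = joint_probability(x, y, pair_df, n)
    if pxy == 0:
        return float("-inf")
    return math.log2(pxy / ((word_df[x] / n) * (word_df[y] / n)))


def spearman(a, b):
    """Spearman rank-order correlation of two equal-length score lists.
    Uses the simple formula that assumes no tied scores."""
    def ranks(values):
        order = sorted(range(len(values)), key=lambda i: values[i])
        r = [0] * len(values)
        for rank, i in enumerate(order, start=1):
            r[i] = rank
        return r

    ra, rb = ranks(a), ranks(b)
    n = len(a)
    d2 = sum((p - q) ** 2 for p, q in zip(ra, rb))
    return 1 - 6 * d2 / (n * (n * n - 1))
```

In this sketch a word counts once per window no matter how often it appears there, which is the kind of information a density-aware estimator would additionally take into account. Varying `size` in `windows` corresponds to the window-size comparison described above.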