A retrospective study of a hybrid document-context based retrieval model

Authors:
H. C. Wu; Robert W. P. Luk; K. F. Wong; K. L. Kwok
Affiliations:
Department of Computing, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong; Department of Computing, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong; Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong; Department of Computer Science, Queen's College, City University of New York, 65-30 Kissena Boulevard Flushing, NY 11367-1597, USA
Venue:
Information Processing and Management: an International Journal
Year:
2007

Abstract

This paper describes our novel retrieval model, which is based on the contexts of query terms in documents (i.e., document contexts). Our model is novel because it explicitly takes the document contexts into account instead of implicitly using them to find query expansion terms. Our model is based on simulating a user making relevance decisions, and it is a hybrid of various existing effective models and techniques. It estimates the relevance decision preference of a document context as the log-odds and uses smoothing techniques, as found in language models, to solve the problem of zero probabilities. It combines these estimated preferences of document contexts using different types of aggregation operators that comply with different relevance decision principles (e.g., the aggregate relevance principle). Our model is evaluated using retrospective experiments (i.e., with full relevance information), because such experiments can (a) reveal the potential of our model, (b) isolate the problems of the model from those of the parameter estimation, (c) provide information about the major factors affecting the retrieval effectiveness of the model, and (d) show whether the model obeys the probability ranking principle. Our model is promising, as its mean average precision is 60-80% in our experiments using different TREC ad hoc English collections and the NTCIR-5 ad hoc Chinese collection. Our experiments showed that (a) the operators that are consistent with the aggregate relevance principle were effective in combining the estimated preferences, and (b) estimating probabilities using the contexts in the relevant documents can produce better retrieval effectiveness than using the entire relevant documents.
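
The pipeline the abstract outlines (extract document contexts, score each context by smoothed log-odds, aggregate the scores per document) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The fixed window size, the Jelinek-Mercer style smoothing, the term-independence assumption, and the use of `sum` and `max` as stand-ins for the aggregation operators are all assumptions made for illustration, as are the function names and the toy data.

```python
import math
from collections import Counter


def contexts(tokens, query_terms, half_window=2):
    """Return a window of tokens around each query-term occurrence.

    The fixed half_window is an assumption; the abstract does not give a size.
    """
    found = []
    for i, tok in enumerate(tokens):
        if tok in query_terms:
            start = max(0, i - half_window)
            found.append(tokens[start:i + half_window + 1])
    return found


def smoothed_prob(word, counts, total, background, lam=0.5):
    """Mix the maximum-likelihood estimate with a background model
    (Jelinek-Mercer style, assumed here) so no probability is zero."""
    ml = counts[word] / total if total else 0.0
    return (1 - lam) * ml + lam * background[word]


def context_log_odds(context, rel, nonrel, background, lam=0.5):
    """Log-odds preference of one context, assuming term independence."""
    rel_total, nonrel_total = sum(rel.values()), sum(nonrel.values())
    score = 0.0
    for w in context:
        p_rel = smoothed_prob(w, rel, rel_total, background, lam)
        p_non = smoothed_prob(w, nonrel, nonrel_total, background, lam)
        score += math.log(p_rel / p_non)
    return score


def score_document(tokens, query_terms, rel, nonrel, background, aggregate=sum):
    """Combine the context preferences of one document with an aggregator."""
    prefs = [context_log_odds(c, rel, nonrel, background)
             for c in contexts(tokens, query_terms)]
    # A document with no query-term occurrence has no context to score.
    return aggregate(prefs) if prefs else float("-inf")


# Toy retrospective setting: relevance judgments are known in advance.
query = {"jaguar"}
relevant_docs = [["fast", "jaguar", "car", "engine"]]
nonrelevant_docs = [["wild", "jaguar", "cat", "habitat"]]

# Statistics are estimated from contexts, not from whole documents.
rel = Counter(w for d in relevant_docs for c in contexts(d, query) for w in c)
nonrel = Counter(w for d in nonrelevant_docs for c in contexts(d, query) for w in c)

# Background model: relative frequencies over the whole toy collection.
all_tokens = [w for d in relevant_docs + nonrelevant_docs for w in d]
background = {w: n / len(all_tokens) for w, n in Counter(all_tokens).items()}

for doc in relevant_docs + nonrelevant_docs:
    for name, agg in (("sum", sum), ("max", max)):
        print(name, doc, score_document(doc, query, rel, nonrel, background, agg))
```

Passing a different callable as `aggregate` swaps the rule for combining context preferences without touching the scoring code, which mirrors the abstract's separation between estimating preferences and aggregating them.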