In this paper, we present a well-defined general matrix framework for modelling Information Retrieval (IR). In this framework, collections, documents and queries correspond to matrix spaces. Retrieval aspects, such as content, structure and semantics, are expressed by matrices defined in these spaces and by matrix operations applied to them. The dualities of these spaces are identified by applying frequency-based operations to the proposed matrices and by investigating the meaning of their eigenvectors. This allows term weighting concepts used for content-based retrieval, such as term frequency and inverse document frequency, to translate directly to concepts for structure-based retrieval. In addition, concepts such as PageRank, authorities and hubs, which are determined by exploiting the structural relationships between linked documents, can be defined with respect to the semantic relationships between terms. Moreover, this mathematical framework can be used to express classical and alternative evaluation measures, involving, for instance, the structure of documents, and to further explain and relate IR models and theory. The high level of reusability and abstraction of the framework yields a logical layer for IR that makes system design and construction significantly more efficient, so that better and increasingly personalised systems can be built at lower cost.
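
To make the duality idea concrete, the following is a minimal sketch and not the paper's own formulation. It assumes a toy document-term count matrix `DT` and uses hypothetical helper names (`document_frequency`, `idf`, `principal_eigenvector`). The same frequency-based operation is applied to the matrix and to its transpose, and the principal eigenvectors of the two product matrices stand in, loosely, for authority-like scores over terms and hub-like scores over documents.

```python
import math

# Hypothetical toy collection: rows are documents, columns are terms.
# DT[d][t] is the number of occurrences of term t in document d.
TERMS = ["matrix", "retrieval", "eigenvector"]
DT = [
    [2, 1, 0],
    [0, 1, 0],
    [1, 1, 3],
]

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def document_frequency(dt):
    """Count, for each column, the rows in which it has a non-zero entry."""
    return [sum(1 for v in col if v > 0) for col in transpose(dt)]

def idf(dt):
    """Plain log(N / df); assumes every column occurs in at least one row."""
    n_rows = len(dt)
    return [math.log(n_rows / df) for df in document_frequency(dt)]

def principal_eigenvector(m, iterations=100):
    """Power iteration on a square non-negative matrix, L2-normalised."""
    v = [1.0] * len(m)
    for _ in range(iterations):
        w = [sum(a * x for a, x in zip(row, v)) for row in m]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

# Content view: in how many documents does each term occur?
term_df = document_frequency(DT)

# Dual view: the same operation on the transpose counts distinct terms per document.
doc_tf = document_frequency(transpose(DT))

# Eigenvector view: term-term and document-document co-occurrence matrices.
term_scores = principal_eigenvector(matmul(transpose(DT), DT))
doc_scores = principal_eigenvector(matmul(DT, transpose(DT)))
```

The choice of plain `log(N / df)` and of unweighted co-occurrence products is an assumption made for brevity; other weightings would fit the same pattern of applying one operation to a matrix and to its transpose.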