Statistical principal components analysis for retrieval experiments
Journal of the American Society for Information Science and Technology
Consider information retrieval systems that respond to a query (a natural-language statement of a topic, an information need) with an ordered list of 1000 documents drawn from the document collection. From the responses to queries that all express the same topic, one can discern how the words associated with that topic produce particular system behavior. From what is discerned across different topics, one can hypothesize abstract topic factors that influence system performance; an example of such a factor is the specificity of the topic's primary keyword. This paper shows that statements about the effect of abstract topic factors on system performance can be supported empirically. A combination of statistical methods is applied to system responses from NIST's Text REtrieval Conference (TREC). We analyze each topic using a measure of irrelevant-document exclusion computed for each response and a measure of dissimilarity between relevant-document return orders computed for each pair of responses. We formulate topic factors through graphical comparison of measurements for different topics. Finally, we propose for each topic a four-dimensional summarization that we use to select topic comparisons likely to depict topic factors clearly.
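The two per-topic measurements named above can be illustrated with a small sketch. The paper's exact definitions are not given here, so the following is an assumption: exclusion is taken as the fraction of a system's ranked response that avoids known-irrelevant documents, and the dissimilarity between two systems' relevant-document return orders is taken as a Kendall-tau-style fraction of discordantly ordered relevant-document pairs. The system responses `sys_a`, `sys_b` and the judgment sets are invented toy data.

```python
def exclusion_score(ranked, irrelevant, k=1000):
    """Fraction of the top-k response that excludes known-irrelevant documents.

    Assumed form of the paper's irrelevant-document-exclusion measure.
    """
    top = ranked[:k]
    return 1.0 - sum(d in irrelevant for d in top) / len(top)


def order_dissimilarity(ranked_a, ranked_b, relevant):
    """Fraction of relevant-document pairs that two responses order differently.

    Kendall-tau-style discordance over relevant documents returned by both
    systems; an assumed stand-in for the paper's pairwise dissimilarity.
    """
    common = [d for d in sorted(relevant) if d in ranked_a and d in ranked_b]
    pos_a = {d: ranked_a.index(d) for d in common}
    pos_b = {d: ranked_b.index(d) for d in common}
    discordant = total = 0
    for i in range(len(common)):
        for j in range(i + 1, len(common)):
            x, y = common[i], common[j]
            total += 1
            # A pair is discordant when the two systems rank it in opposite orders.
            if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0:
                discordant += 1
    return discordant / total if total else 0.0


# Toy example: two systems' ranked responses to the same topic.
sys_a = ["d1", "d2", "d3", "d4", "d5"]
sys_b = ["d2", "d1", "d5", "d3", "d4"]
relevant = {"d1", "d2", "d5"}
irrelevant = {"d4"}

print(exclusion_score(sys_a, irrelevant, k=5))      # 0.8 (one irrelevant doc in five)
print(order_dissimilarity(sys_a, sys_b, relevant))  # 1/3 (d1/d2 pair is swapped)
```

In the paper's setting, the exclusion score yields one value per response and the dissimilarity yields one value per pair of responses to the same topic; collecting these across all systems gives the per-topic measurements that are then compared graphically and summarized.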