Introduction to Modern Information Retrieval
SIGIR '80 Proceedings of the 3rd annual ACM conference on Research and development in information retrieval
This study investigated two hypotheses concerning the use of anaphors in information retrieval. The first hypothesis, that anaphors tend to refer to integral concepts rather than to peripheral concepts, was well supported. Two samples of documents, one in psychology and the other in computer science, were examined by subject experts, who judged the centrality of the phrases that were referred to anaphorically. The second hypothesis, that various term weighting schemes are affected differently by anaphoric resolution, was also well supported. Schemes that incorporate document length into their calculations were found to produce much smaller increases in term weights for terms occurring in anaphoric resolutions than schemes that do not consider document length. It is concluded that although anaphoric resolution has potential for better representing the “aboutness” of a document, care must be taken in choosing both the anaphoric classes to be resolved and the term weighting schemes used to measure a document's topicality.
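The interaction between anaphoric resolution and document length can be illustrated with a toy sketch. This is not the study's method or data: the sample sentence, the assumption that every "it" refers to "retrieval system", and the two weighting functions (raw term frequency, and term frequency divided by token count) are all hypothetical choices made for illustration. Because a resolved anaphor both adds an occurrence of the term and lengthens the document, a length-aware weight rises by a smaller factor than a raw count.

```python
from collections import Counter


def raw_tf(tokens, term):
    """Raw term frequency: the number of occurrences of `term`."""
    return Counter(tokens)[term]


def length_normalized_tf(tokens, term):
    """Term frequency divided by document length in tokens."""
    return Counter(tokens)[term] / len(tokens) if tokens else 0.0


def resolve(tokens, anaphor, referent):
    """Replace each occurrence of `anaphor` with the tokens of `referent`.

    Toy assumption: every occurrence of `anaphor` has the same referent.
    """
    resolved_tokens = []
    for token in tokens:
        resolved_tokens.extend(referent if token == anaphor else [token])
    return resolved_tokens


# Hypothetical one-sentence "document"; both uses of "it" are assumed
# to refer to "retrieval system".
original = (
    "the retrieval system indexes documents and it ranks them "
    "because it scores terms"
).split()
resolved = resolve(original, "it", ["retrieval", "system"])

for name, weight in [
    ("raw tf", raw_tf),
    ("length-normalized tf", length_normalized_tf),
]:
    before = weight(original, "system")
    after = weight(resolved, "system")
    print(f"{name}: {before:.3f} -> {after:.3f} (factor {after / before:.2f})")
```

In this sketch the raw count of "system" triples, while the length-normalized weight grows by a smaller factor because the resolved document is two tokens longer. Real weighting schemes differ in how they use document length, so the size of the effect depends on the scheme chosen.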