Despite long controversy, the effectiveness of phrasal indexing is not yet clear. Recently, a correlation between query length and the effect of phrasal indexing has been reported. In this paper, terms extracted from the topic set of the NACSIS test collection 1 are analyzed with statistical tools to show the distribution characteristics of single-word and phrasal terms with regard to relevant and non-relevant documents. Phrasal terms are found to be very good discriminators in general, but not all of them are effective as supplemental phrasal terms. A distinction between informative, neutral, and destructive phrasal terms is introduced. Retrieval effectiveness is examined using the query weight ratio of these three categories of phrasal terms.
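As a rough illustration of what comparing a term's distribution over relevant and non-relevant documents can look like, the sketch below computes a smoothed log-odds ratio from document counts. The abstract does not say which statistic the paper uses, so the function `log_odds`, the smoothing constant, and the toy documents are assumptions made for illustration only.

```python
import math

def log_odds(term, relevant_docs, nonrelevant_docs, smooth=0.5):
    """Smoothed log-odds ratio of `term` occurring in relevant vs. non-relevant documents.

    Hypothetical helper: each document is a set of terms, where a phrasal term
    is simply a string containing a space. Positive values mean the term is
    more typical of relevant documents.
    """
    r = sum(1 for d in relevant_docs if term in d)      # relevant docs containing the term
    n = sum(1 for d in nonrelevant_docs if term in d)   # non-relevant docs containing the term
    p = (r + smooth) / (len(relevant_docs) + 2 * smooth)
    q = (n + smooth) / (len(nonrelevant_docs) + 2 * smooth)
    return math.log(p / (1 - p)) - math.log(q / (1 - q))

# Toy data (invented, not taken from the NACSIS collection).
relevant = [
    {"information", "retrieval", "information retrieval"},
    {"information retrieval", "phrasal"},
]
nonrelevant = [
    {"information", "theory"},
    {"retrieval", "dog"},
    {"information"},
]

phrase_score = log_odds("information retrieval", relevant, nonrelevant)
word_score = log_odds("information", relevant, nonrelevant)
```

On this toy data the phrasal term scores higher than its single-word component, which matches the abstract's remark that phrasal terms tend to be good discriminators. It says nothing about whether a given phrase would be informative, neutral, or destructive as a supplemental query term.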