Document frequency is used in many applications in Information Retrieval and related fields. A common assumption is that a term's document frequency reflects its level of specificity. However, empirical evidence for this assumption is limited. A large-scale experiment was therefore carried out on multiple corpora to gain further insight into the relationship between document frequency and term specificity. The results show that the assumption holds only at the very specific levels that cover the majority of the vocabulary. The results also show that a larger corpus estimates specificity more accurately. When only a small corpus is available, however, co-occurrence information is shown to be effective for improving the accuracy.
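
To make the assumption under test concrete, the following is a minimal sketch of how document frequency is commonly turned into a specificity score. It is not the experimental procedure of the paper, which the abstract does not describe. The toy corpus, the whitespace tokenizer, the function names, and the use of log(N / df) (the standard inverse document frequency) as the specificity proxy are all assumptions made for illustration.

```python
import math
from collections import Counter


def document_frequency(docs):
    """Count, for each term, the number of documents that contain it."""
    df = Counter()
    for doc in docs:
        # A set ensures each term is counted at most once per document.
        df.update(set(doc.lower().split()))
    return df


def idf_specificity(df, n_docs):
    """Map each term to log(N / df); a higher score means a rarer term,
    which the assumption under test treats as a more specific term."""
    return {term: math.log(n_docs / count) for term, count in df.items()}


# Toy corpus (hypothetical, for illustration only).
docs = [
    "the animal ran",
    "the dog is an animal",
    "the poodle is a dog and an animal",
    "the cat is an animal",
]

df = document_frequency(docs)
spec = idf_specificity(df, len(docs))

# Print terms from most to least "specific" under this proxy.
for term in sorted(spec, key=spec.get, reverse=True):
    print(f"{term:8s} df={df[term]}  specificity={spec[term]:.3f}")
```

In this toy corpus the proxy ranks "poodle" above "dog" above "animal", which matches intuition. A corpus this small also shows why the estimate is fragile: "ran" and "poodle" get the same score simply because each occurs in one document.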