Some authors of web documents attach keywords to their documents as metadata. These keywords reflect the authors' intended main topics. To acquire topic-related keywords automatically, this paper proposes a method that mines subtopics as topic-related keywords from Japanese and English web documents. The method uses simple patterns and a hierarchical structure of candidate strings built from clusters of relevant documents. We create alternative partial topics from the original topic and extract candidate strings using simple patterns based on each partial topic and on POS tags. We then construct the hierarchical structure of the candidate strings according to the proposed process and rank the candidates using this structure together with the frequency information for each group of candidate strings in the part of the hierarchy that satisfies the diversity requirement. Our method outperformed the baselines, and the results should be useful for a variety of topic annotation tasks.
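
The pipeline described above (partial topics, pattern-based candidate extraction, a hierarchy over candidates, and frequency-based ranking with one result per group) can be illustrated with a toy sketch. This is not the paper's algorithm: the function names, the "topic followed by one or two words" pattern, the prefix-based hierarchy, and the one-representative-per-group diversity rule are all assumptions made for illustration, and POS tagging and document clustering are omitted entirely.

```python
import re
from collections import Counter


def partial_topics(topic):
    """Return every contiguous sub-phrase of the topic, longest first (assumed definition)."""
    words = topic.lower().split()
    spans = {" ".join(words[i:j])
             for i in range(len(words))
             for j in range(i + 1, len(words) + 1)}
    return sorted(spans, key=lambda s: (-len(s.split()), s))


def extract_candidates(docs, topic, max_extra=2):
    """Count strings matching the assumed pattern '<partial topic> + 1..max_extra words'."""
    counts = Counter()
    parts = [p.split() for p in partial_topics(topic)]
    for doc in docs:
        tokens = re.findall(r"[a-z0-9]+", doc.lower())
        for part in parts:
            n = len(part)
            for i in range(len(tokens) - n + 1):
                if tokens[i:i + n] != part:
                    continue
                for k in range(1, max_extra + 1):
                    if i + n + k <= len(tokens):
                        counts[" ".join(tokens[i:i + n + k])] += 1
    return counts


def build_hierarchy(candidates):
    """Map each candidate to its parent: the longest other candidate that is a word-level prefix."""
    parent = {}
    for c in candidates:
        prefixes = [p for p in candidates if p != c and c.startswith(p + " ")]
        parent[c] = max(prefixes, key=len) if prefixes else None
    return parent


def rank_subtopics(docs, topic, top_k=3):
    """Rank groups by total frequency and return one representative per group."""
    counts = extract_candidates(docs, topic)
    parent = build_hierarchy(counts)
    groups = {}
    for c in counts:
        root = c
        while parent[root] is not None:  # walk up to the top of the hierarchy
            root = parent[root]
        groups.setdefault(root, []).append(c)
    ranked = sorted(groups.values(),
                    key=lambda g: (-sum(counts[c] for c in g), min(g)))
    # Diversity (assumed rule): at most one string, the most frequent, per group.
    return [max(g, key=lambda c: (counts[c], len(c))) for g in ranked[:top_k]]


docs = ["jaguar price list", "jaguar price guide", "jaguar speed"]
print(rank_subtopics(docs, "jaguar"))
```

Taking a single representative from each top-level group is only one simple way to keep the ranked list from filling up with near-duplicates; the paper's own diversity requirement over the hierarchy may be defined differently.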