Hierarchical subtopic mining for topic annotation

Authors:
Se-Jong Kim;Ki-Young Shin;Jong-Hyeok Lee
Affiliations:
POSTECH, Pohang, South Korea;POSTECH, Pohang, South Korea;POSTECH, Pohang, South Korea
Venue:
Proceedings of the sixth international workshop on Exploiting semantic annotations in information retrieval
Year:
2013

Abstract

Some authors of web documents attach keywords to their documents as metadata, and these keywords reflect the authors' intents for the main topics. To acquire such topic-related keywords automatically, this paper proposes a method that mines subtopics from Japanese and English web documents, using simple patterns and a hierarchical structure of candidate strings built on clusters of relevant documents. We create alternative partial-topics from the original topic and extract candidate strings using simple patterns based on each partial-topic and POS tags. We then construct the hierarchical structure of the candidate strings according to the proposed process and rank the strings using this structure together with the frequency information for each group of candidate strings in the area of the hierarchy that satisfies the diversity requirement. Our method outperformed the baselines, and the results are expected to be useful for a variety of topic annotation tasks.
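
The pipeline outlined in the abstract (partial-topics, pattern-based candidate extraction, a hierarchy over candidates, and diversity-aware ranking) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the patterns, the hierarchy construction, or the ranking formula, so every function name and rule below is an assumption made for illustration. In particular, partial-topics are taken to be contiguous word sub-sequences of the topic, the only pattern is "partial-topic followed by one word", POS tags and document clustering are omitted, the hierarchy uses word-level containment, and diversity is approximated by interleaving groups.

```python
from collections import Counter


def partial_topics(topic):
    """Contiguous word sub-sequences of the topic (an assumed definition)."""
    words = topic.lower().split()
    n = len(words)
    return {" ".join(words[i:j]) for i in range(n) for j in range(i + 1, n + 1)}


def extract_candidates(topic, documents):
    """Count strings matching the assumed pattern: partial-topic + next word."""
    partials = partial_topics(topic)
    counts = Counter()
    for doc in documents:
        tokens = doc.lower().split()
        for partial in partials:
            p = partial.split()
            # Stop one position early so that a following word always exists.
            for i in range(len(tokens) - len(p)):
                if tokens[i:i + len(p)] == p:
                    candidate = " ".join(tokens[i:i + len(p) + 1])
                    if candidate not in partials:  # skip the topic itself
                        counts[candidate] += 1
    return counts


def contains(short, long):
    """True if `short` is a strictly shorter contiguous word span of `long`."""
    s, l = short.split(), long.split()
    return len(s) < len(l) and any(
        l[i:i + len(s)] == s for i in range(len(l) - len(s) + 1)
    )


def build_hierarchy(candidates):
    """Map each candidate to its parent: the longest candidate it contains."""
    parent = {}
    for c in candidates:
        ancestors = [o for o in candidates if contains(o, c)]
        parent[c] = max(ancestors, key=lambda o: (len(o.split()), o), default=None)
    return parent


def rank_subtopics(counts, parent):
    """Group candidates by root, then interleave groups for diversity."""
    def root(c):
        while parent[c] is not None:
            c = parent[c]
        return c

    groups = {}
    for c in counts:
        groups.setdefault(root(c), []).append(c)
    # Groups with higher total frequency come first; ties break alphabetically.
    ordered = sorted(groups.values(), key=lambda g: (-sum(counts[c] for c in g), min(g)))
    for g in ordered:
        g.sort(key=lambda c: (-counts[c], c))
    ranked, depth = [], 0
    while any(depth < len(g) for g in ordered):
        ranked.extend(g[depth] for g in ordered if depth < len(g))
        depth += 1
    return ranked


# Toy documents, invented purely for this example.
docs = [
    "cheap hotel booking tips",
    "find a cheap hotel booking site",
    "hotel reviews and cheap flights",
    "hotel booking deals",
]
counts = extract_candidates("cheap hotel", docs)
print(rank_subtopics(counts, build_hierarchy(counts)))
```

Interleaving one candidate per group before returning to the largest group is only one simple way to favor diverse subtopics near the top of the list; the paper's own diversity requirement may be defined quite differently.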