Hierarchical subtopic mining for topic annotation

  • Authors: Se-Jong Kim, Ki-Young Shin, Jong-Hyeok Lee

  • Affiliations: POSTECH, Pohang, South Korea (all authors)

  • Venue: Proceedings of the sixth international workshop on Exploiting semantic annotations in information retrieval

  • Year: 2013

Abstract

Some authors of web documents attach keywords to their documents as metadata, and these keywords reflect what the authors consider the main topics. To acquire such topic-related keywords automatically, this paper proposes a method that mines subtopics of a topic from web documents in Japanese and English. The method relies on simple patterns and on a hierarchical structure of candidate strings built from clusters of relevant documents. We first create alternative partial-topics from the original topic and extract candidate strings using simple patterns based on each partial-topic and on POS tags. We then construct the hierarchical structure of the candidate strings and rank them using this structure together with the frequency information for each group of candidate strings in the area of the hierarchy that satisfies the diversity requirement. Our method outperformed the baselines, and the results should be useful for a range of topic annotation tasks.
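
The pipeline outlined in the abstract (partial-topics, pattern-based candidate extraction, a hierarchy over candidates, frequency-based ranking) can be illustrated with a rough Python sketch. This is not the authors' implementation: the abstract does not specify the patterns, the hierarchy construction, or the ranking formula, so every function name and rule below is a hypothetical stand-in. In particular, the sketch assumes whitespace-tokenised English text, omits POS filtering and document clustering, uses "partial-topic plus up to two following words" as the only pattern, and treats word-level containment as the parent relation.

```python
import re
from collections import Counter


def partial_topics(topic):
    """Hypothetical stand-in: all contiguous word spans of the topic, longest first."""
    words = topic.lower().split()
    spans = {" ".join(words[i:j])
             for i in range(len(words))
             for j in range(i + 1, len(words) + 1)}
    return sorted(spans, key=lambda s: (-len(s.split()), s))


def extract_candidates(docs, partials, max_extra=2):
    """Assumed pattern: a partial-topic followed by 1..max_extra words.

    Each (start, end) token span is counted once per document, even if
    several partial-topics produce it.
    """
    counts = Counter()
    for doc in docs:
        tokens = re.findall(r"\w+", doc.lower())
        spans = set()
        for p in partials:
            p_tokens = p.split()
            n = len(p_tokens)
            for i in range(len(tokens) - n + 1):
                if tokens[i:i + n] == p_tokens:
                    for k in range(1, max_extra + 1):
                        if i + n + k <= len(tokens):
                            spans.add((i, i + n + k))
        for start, end in spans:
            counts[" ".join(tokens[start:end])] += 1
    return counts


def build_hierarchy(candidates):
    """Assumed parent rule: the longest other candidate contained in the string."""
    parent = {}
    for c in candidates:
        contained = [o for o in candidates if o != c and f" {o} " in f" {c} "]
        parent[c] = max(contained, key=lambda o: (len(o.split()), o), default=None)
    return parent


def rank(counts, parent):
    """Assumed score: own frequency plus the frequency of all descendants."""
    score = Counter(counts)
    # Visit longer (deeper) strings first so each child's total is final
    # before it is added to its parent.
    for c in sorted(counts, key=lambda s: -len(s.split())):
        if parent[c] is not None:
            score[parent[c]] += score[c]
    return sorted(score.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    docs = ["Harry Potter books are popular",
            "Harry Potter movies and Harry Potter books"]
    counts = extract_candidates(docs, partial_topics("Harry Potter"))
    for string, score in rank(counts, build_hierarchy(counts)):
        print(score, string)
```

Without POS filtering the sketch keeps noisy strings such as those ending in a function word, which is presumably one reason the paper combines its patterns with POS tags.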