Topic structure mining using PageRank without hyperlinks

  • Authors:
  • Hiroyuki Toda; Ko Fujimura; Ryoji Kataoka; Hiroyuki Kitagawa

  • Affiliations:
  • NTT Cyber Solutions Laboratories, NTT Corporation, Kanagawa, Japan; NTT Cyber Solutions Laboratories, NTT Corporation, Kanagawa, Japan; NTT Cyber Solutions Laboratories, NTT Corporation, Kanagawa, Japan; Graduate School of Systems and Information Engineering

  • Venue:
  • ICADL'06 Proceedings of the 9th International Conference on Asian Digital Libraries: Achievements, Challenges and Opportunities
  • Year:
  • 2006

Abstract

This paper proposes a novel text mining method for any given document set. The method computes PageRank-based centrality scores over a graph constructed from the similarity of all document pairs. Evaluations on a newspaper collection show that the proposed approach performs much better than the baseline method at main topic identification and topical clustering. We also present an example of document set visualization that supports a novel way of browsing documents through the topic structure. Experiments show that the topic structure mining method is useful for user-oriented document selection.
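
The core idea in the abstract, PageRank centrality over a document similarity graph rather than a hyperlink graph, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the similarity measure, edge threshold, or damping factor, so the bag-of-words cosine similarity, the `threshold` of 0.1, the `damping` of 0.85, and the function name `similarity_pagerank` are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counter objects)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_pagerank(docs, threshold=0.1, damping=0.85, iters=100):
    """Score documents by PageRank over a similarity graph.

    Assumptions (not taken from the paper): whitespace tokenization,
    raw term counts, cosine similarity, and a fixed edge threshold.
    """
    vecs = [Counter(d.lower().split()) for d in docs]
    n = len(vecs)

    # Build a symmetric weighted adjacency matrix from pairwise similarity.
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            s = cosine(vecs[i], vecs[j])
            if s >= threshold:
                w[i][j] = w[j][i] = s

    out = [sum(row) for row in w]  # total outgoing edge weight per node
    scores = [1.0 / n] * n

    # Power iteration; nodes without edges spread their score uniformly.
    for _ in range(iters):
        dangling = sum(scores[i] for i in range(n) if out[i] == 0)
        new_scores = []
        for j in range(n):
            flow = sum(scores[i] * w[i][j] / out[i]
                       for i in range(n) if out[i] > 0)
            new_scores.append((1 - damping) / n
                              + damping * (flow + dangling / n))
        scores = new_scores
    return scores


# Hypothetical toy collection: three related documents and one outlier.
docs = [
    "election vote results",
    "election vote turnout",
    "election results turnout vote",
    "football match score",
]
scores = similarity_pagerank(docs)
ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
```

In this toy collection, the document that shares the most vocabulary with the others receives the highest score and the unrelated document the lowest, which is the intuition behind using centrality to pick out a main topic. A real system would likely use TF-IDF weighting and a tuned threshold; those choices are left open here.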