Document clustering based on web search hit counts

  • Authors:
  • Masaya Kaneko; Shusuke Okamoto; Masaki Kohana; You Inayoshi

  • Affiliations:
  • Graduate School of Science and Technology, Seikei University, 3-3-1, Kichijoji-Kitamachi, Musashino-shi, Tokyo, 180-8633, Japan (all authors)

  • Venue:
  • International Journal of Business Intelligence and Data Mining
  • Year:
  • 2013

Abstract

This paper describes a web mining method for automatically clustering research documents. The hit counts returned by web AND-searches on pairs of words are used to build a feature vector for each document. The target documents are then clustered by applying the k-means method twice, with cosine similarity as the distance measure.
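The abstract does not spell out how the feature vectors are built or how the two k-means passes are combined, so the following is only a minimal sketch of the general idea, under stated assumptions. It assumes each document is represented by one keyword and that each vector component is the AND-search hit count of that keyword with a fixed reference term. The names `FAKE_HITS`, `AXIS_TERMS`, `hit_count`, `feature_vector`, and `kmeans_cosine` are hypothetical, the hit counts are made up, and only a single k-means pass is shown.

```python
import math

# Made-up hit counts standing in for real web AND-search results (assumption).
FAKE_HITS = {
    ("clustering", "data mining"): 900, ("clustering", "programming language"): 50,
    ("compiler", "data mining"): 40, ("compiler", "programming language"): 800,
    ("classification", "data mining"): 700, ("classification", "programming language"): 60,
    ("parser", "data mining"): 30, ("parser", "programming language"): 600,
}

# Hypothetical reference terms that define the vector dimensions.
AXIS_TERMS = ["data mining", "programming language"]


def hit_count(word_a, word_b):
    """Placeholder for a web AND-search; returns a made-up hit count."""
    return FAKE_HITS.get((word_a, word_b), 0)


def feature_vector(keyword, axis_terms):
    """One component per reference term: hits for 'keyword AND term'."""
    return [float(hit_count(keyword, term)) for term in axis_terms]


def cosine_similarity(u, v):
    """Cosine of the angle between u and v; 0.0 if either is a zero vector."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def normalize(v):
    """Scale v to unit length so large hit counts do not dominate centroids."""
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v] if norm else list(v)


def kmeans_cosine(vectors, k, iterations=20):
    """Single k-means pass that assigns each vector to its most similar centroid."""
    unit = [normalize(v) for v in vectors]
    # Deterministic initialization for the sketch: first k vectors as centroids.
    centroids = [list(v) for v in unit[:k]]
    labels = None
    for _ in range(iterations):
        new_labels = [
            max(range(k), key=lambda c: cosine_similarity(v, centroids[c]))
            for v in unit
        ]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        for c in range(k):
            members = [v for v, lab in zip(unit, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


keywords = ["clustering", "compiler", "classification", "parser"]
vectors = [feature_vector(w, AXIS_TERMS) for w in keywords]
labels = kmeans_cosine(vectors, k=2)
```

With these made-up counts, "clustering" and "classification" end up in one cluster and "compiler" and "parser" in the other. Normalizing the vectors first is a design choice of this sketch: it keeps the centroid update consistent with a cosine-based assignment step.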