Document classification based on web search hit counts

  • Authors: Masaya Kaneko; Shusuke Okamoto; Masaki Kohana; You Inayoshi
  • Affiliation: Seikei University, Tokyo, Japan (all authors)
  • Venue: Proceedings of the 14th International Conference on Information Integration and Web-based Applications & Services
  • Year: 2012

Abstract

This paper describes a web mining method for automatically classifying research documents. Each document is represented by a vector built from the web hit counts of AND searches on pairs of words. Target documents are then classified according to the result of k-means clustering, in which cosine similarity serves as the distance measure.
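
The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the word pairs are chosen or how the clustering is initialized, so the sketch assumes that each document is represented by a single term paired with a fixed list of reference words, that the hit counts come from a hard-coded table (`HIT_COUNTS`, with invented numbers, standing in for a real search engine query), and that the first k vectors seed the centroids. All names here (`REFERENCE_WORDS`, `HIT_COUNTS`, `document_vector`, `kmeans_cosine`) are hypothetical.

```python
import math

# Hypothetical reference words; the paper's actual word list is not given.
REFERENCE_WORDS = ["network", "database", "graphics"]

# Invented hit counts for AND searches on (document term, reference word).
# A real system would obtain these numbers from a web search engine.
HIT_COUNTS = {
    ("routing", "network"): 900, ("routing", "database"): 50, ("routing", "graphics"): 10,
    ("sql", "network"): 40, ("sql", "database"): 800, ("sql", "graphics"): 20,
    ("tcp", "network"): 700, ("tcp", "database"): 30, ("tcp", "graphics"): 5,
    ("index", "network"): 60, ("index", "database"): 600, ("index", "graphics"): 30,
}


def document_vector(term):
    """Build a vector with one hit count per reference word."""
    return [HIT_COUNTS.get((term, ref), 0) for ref in REFERENCE_WORDS]


def normalize(vector):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


def cosine_similarity(u, v):
    """Cosine of the angle between u and v (0.0 if either is a zero vector)."""
    return sum(a * b for a, b in zip(normalize(u), normalize(v)))


def kmeans_cosine(vectors, k, max_iterations=20):
    """k-means that assigns each vector to its most cosine-similar centroid."""
    # Assumption: seed the centroids with the first k vectors (deterministic).
    centroids = [normalize(v) for v in vectors[:k]]
    labels = None
    for _ in range(max_iterations):
        new_labels = [
            max(range(k), key=lambda c: cosine_similarity(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [normalize(v) for v, label in zip(vectors, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                mean = [sum(column) / len(members) for column in zip(*members)]
                centroids[c] = normalize(mean)
    return labels


terms = ["routing", "sql", "tcp", "index"]
vectors = [document_vector(term) for term in terms]
labels = kmeans_cosine(vectors, k=2)
for term, label in zip(terms, labels):
    print(term, "-> cluster", label)
```

Because cosine similarity ignores vector length, a term with uniformly larger hit counts is not pulled into a different cluster merely for being more common on the web; only the relative distribution across the reference words matters.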