Extracting spam blogs with co-citation clusters

  • Authors:
  • Kazunari Ishida

  • Affiliations:
  • The University of Shimane, Hamada-shi, Japan

  • Venue:
  • Proceedings of the 17th international conference on World Wide Web
  • Year:
  • 2008

Quantified Score

Hi-index 0.00

Visualization

Abstract

This paper reports the estimated number of spam blogs in order to assess their current state in the blogosphere. To extract spam blogs, I developed a traversal method among co-citation clusters of blogs from a spam seed. Spam seeds were collected in terms of high out-degree and spam keyword. According to the experiment, a mixed seed set composed of high out-degree and spam keyword seeds is more effective than individual seed sets in terms of F-Measure. In conclusion, mixed seeds from different methods are effective in improving the F-Measure results of spam extraction with co-citation clusters.