Chinese weblog pages classification based on folksonomy and support vector machines

  • Authors:
  • Xiaoyue Wang;Rujiang Bai;Junhua Liao

  • Affiliations:
  • Shandong University of Technology, China;Shandong University of Technology, China;Shandong University of Technology, China

  • Venue:
  • AIS-ADM'07 Proceedings of the 2nd international conference on Autonomous intelligent systems: agents and data mining
  • Year:
  • 2007

Abstract

For centuries, classification has been used to provide context and direction in every area of human knowledge. Standard machine learning techniques such as support vector machines and related large-margin methods have been applied successfully to this task. Unfortunately, automatic classifiers often misclassify items. Folksonomy, a new manual classification scheme based on users tagging items with freely chosen keywords, can effectively resolve this problem. In folksonomy, users attach tags to items for their own classification, so the tags reflect the viewpoints of many people. Since tags are drawn from the users' own vocabulary and reflect many people's viewpoints, the classification results are easy for ordinary users to understand. Even though folksonomy scales much better than other manual classification schemes, it cannot cope with a tremendous number of items, such as all weblog articles on the Internet. To solve this problem, we propose a new classification method, FSVMC (folksonomy and support vector machine classifier). FSVMC uses support vector machines as Tag-agents, programs that determine whether a particular tag should be attached to a weblog page, while folksonomy is used to categorize the weblog articles. In addition, we propose a method to create a candidate tag database, which is a list of tags that may be attached to weblog pages. Experimental results indicate that our method is more flexible and effective than traditional methods.
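
The abstract gives no implementation details, so the following is only a minimal sketch of the general idea, not the authors' FSVMC. It assumes that the candidate tag database can be approximated by keeping tags that users attached to at least a minimum number of pages, and that each Tag-agent is a binary linear classifier deciding whether its tag applies to a page. The names build_candidate_tags, TagAgent, suggest_tags, the min_count threshold, the whitespace tokenizer, and the toy pages are all hypothetical. The classifier is a bias-free linear SVM trained by subgradient descent on the regularized hinge loss, swapped in here so the example needs no external library; the paper may well use a different SVM formulation, kernel, and feature set.

```python
from collections import Counter


def build_candidate_tags(tagged_pages, min_count=2):
    """Hypothetical candidate tag database: tags used on >= min_count pages."""
    counts = Counter(tag for _, tags in tagged_pages for tag in set(tags))
    return sorted(tag for tag, n in counts.items() if n >= min_count)


def featurize(text):
    """Bag-of-words term counts with whitespace tokenization (an assumption)."""
    return Counter(text.lower().split())


class TagAgent:
    """One binary linear SVM per tag (no bias term), trained by
    subgradient descent on the L2-regularized hinge loss."""

    def __init__(self, tag, lr=0.1, lam=0.01, epochs=100):
        self.tag = tag
        self.lr = lr          # fixed learning rate
        self.lam = lam        # regularization strength
        self.epochs = epochs
        self.w = {}           # sparse weight vector: term -> weight

    def score(self, features):
        return sum(self.w.get(t, 0.0) * v for t, v in features.items())

    def fit(self, tagged_pages):
        # Pages carrying this tag are positives (+1), all others negatives (-1).
        data = [(featurize(text), 1 if self.tag in tags else -1)
                for text, tags in tagged_pages]
        for _ in range(self.epochs):
            for x, y in data:
                margin = y * self.score(x)
                # Shrink all weights (gradient of the regularizer).
                for t in self.w:
                    self.w[t] *= 1.0 - self.lr * self.lam
                # Hinge loss is active only when the margin is below 1.
                if margin < 1:
                    for t, v in x.items():
                        self.w[t] = self.w.get(t, 0.0) + self.lr * y * v
        return self

    def wants(self, text):
        """True if this agent would attach its tag to the page text."""
        return self.score(featurize(text)) > 0


def suggest_tags(agents, text):
    """Ask every Tag-agent whether its tag should be attached."""
    return [agent.tag for agent in agents if agent.wants(text)]


# Toy user-tagged pages (made-up data for illustration only).
pages = [
    ("python code tutorial", ["programming"]),
    ("java code compiler", ["programming"]),
    ("beach hotel trip", ["travel"]),
    ("mountain trip photos", ["travel"]),
    ("my cat", ["pets"]),
]

candidates = build_candidate_tags(pages)
print(candidates)                                     # ['programming', 'travel']
agents = [TagAgent(tag).fit(pages) for tag in candidates]
print(suggest_tags(agents, "python compiler code"))   # ['programming']
print(suggest_tags(agents, "trip to the beach"))      # ['travel']
```

Training one independent binary classifier per tag lets a page receive several tags or none, which matches the multi-label nature of tagging. In practice an established SVM library and a tokenizer suited to Chinese text would replace the hand-written pieces above.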