Adapting Support Vector Machines for F-term-based Classification of Patents

  • Authors: Yaoyong Li; Kalina Bontcheva

  • Affiliations: University of Sheffield, UK; University of Sheffield, UK

  • Venue: ACM Transactions on Asian Language Information Processing (TALIP)
  • Year: 2008

Abstract

Support Vector Machines (SVMs) have obtained state-of-the-art results in many applications, including document classification. However, previous work applying SVMs to the F-term patent classification task did not achieve results as good as those of other learning algorithms such as kNN. This is because F-term patent classification differs from conventional document classification in several respects: it is a multiclass, multilabel classification problem with semi-structured documents and multi-faceted hierarchical categories. This article describes our SVM-based system and several techniques we developed to adapt SVMs to the specific features of the F-term patent classification task. We evaluate the techniques using the NTCIR-6 F-term classification terms assigned to Japanese patents. Moreover, our system participated in the NTCIR-6 patent classification evaluation and obtained the best results according to two of the three metrics used for task performance evaluation. Following the NTCIR-6 participation, we developed two new techniques, which achieved even better scores on all three NTCIR-6 metrics, outperforming all participating systems. This article presents this new work and the experimental results that demonstrate the benefits of the latest approach.