A co-training based method for Chinese patent semantic annotation

  • Authors:
  • Xu Chen;Zhiyong Peng;Cheng Zeng

  • Affiliations:
  • Wuhan University, Wuhan, Hubei, China;Wuhan University, Wuhan, Hubei, China;Wuhan University, Wuhan, China

  • Venue:
  • Proceedings of the 21st ACM international conference on Information and knowledge management
  • Year:
  • 2012

Abstract

Patents are public scientific documents protected by law, and their abstracts contain highly valuable information. Semantic annotation of patents can help protect intellectual property rights and promote corporate research innovation. Currently, automatic patent annotation mainly relies on supervised machine learning algorithms, which require large amounts of expensive labeled patent data. Because labeled Chinese patent data are scarce, this paper adopts a semi-supervised machine learning method, co-training, which starts from a small amount of labeled data. The method combines keyword extraction with list extraction and incrementally annotates functional clauses in patent abstracts. Experimental results indicate that the method gradually improves recall without sacrificing precision.
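
The abstract does not give the algorithm's details, so the following is only a minimal sketch of a generic co-training loop over two views, not the authors' implementation. It assumes that the keyword view sees a clause's content words and that the list view sees an identifier of the list (parallel structure) the clause belongs to. Both interpretations, along with the names `TokenVoteClassifier`, `keyword_view`, `list_view`, `co_train`, the `threshold` parameter, and the toy data, are hypothetical. As a simplification, both views share one labeled pool instead of each teaching the other separately.

```python
from collections import Counter

STOPWORDS = {"the", "a", "to"}  # hypothetical, tiny stopword list


def keyword_view(sample):
    """Assumed keyword view: content words of the clause text."""
    text, _list_id = sample
    return [t for t in text.lower().split() if t not in STOPWORDS]


def list_view(sample):
    """Assumed list view: the id of the list the clause appears in."""
    _text, list_id = sample
    return [list_id]


class TokenVoteClassifier:
    """Toy per-view classifier: features vote by how often they were seen
    in positive versus negative labeled samples."""

    def __init__(self, view):
        self.view = view
        self.pos = Counter()
        self.neg = Counter()

    def fit(self, labeled):
        self.pos.clear()
        self.neg.clear()
        for sample, label in labeled:
            (self.pos if label else self.neg).update(set(self.view(sample)))
        return self

    def predict(self, sample):
        """Return (label, confidence); label is None if no feature is known."""
        feats = set(self.view(sample))
        p = sum(self.pos[f] for f in feats)
        n = sum(self.neg[f] for f in feats)
        if p + n == 0:
            return None, 0.0
        return p >= n, abs(p - n) / (p + n)


def co_train(seed, unlabeled, view_a, view_b, threshold=0.9, max_rounds=10):
    """Grow the labeled set round by round from confident view predictions."""
    labeled, pool = list(seed), list(unlabeled)
    clf_a, clf_b = TokenVoteClassifier(view_a), TokenVoteClassifier(view_b)
    for _ in range(max_rounds):
        clf_a.fit(labeled)
        clf_b.fit(labeled)
        newly, remaining = [], []
        for sample in pool:
            preds = (clf_a.predict(sample), clf_b.predict(sample))
            labels = {lab for lab, conf in preds
                      if lab is not None and conf >= threshold}
            # Accept only when the confident views do not contradict each other.
            if len(labels) == 1:
                newly.append((sample, labels.pop()))
            else:
                remaining.append(sample)
        if not newly:
            break  # nothing new was labeled, so further rounds cannot help
        labeled.extend(newly)
        pool = remaining
    return labeled, pool


# Toy data: (clause text, list id) pairs; True marks a functional clause.
seed = [(("reduces noise", "L1"), True),
        (("relates to a pump", "L0"), False)]
unlabeled = [("improves efficiency", "L1"),
             ("improves safety", "L2"),
             ("lowers cost", "L2"),
             ("relates to a valve", "L3"),
             ("comprises a shell", "L4")]

labeled, pool = co_train(seed, unlabeled, keyword_view, list_view)
for sample, label in labeled:
    print(sample, label)
```

In this toy run the two views reinforce each other: "improves efficiency" is first labeled through its shared list, which teaches the keyword view the word "improves", which in turn labels "improves safety" and opens list `L2` to the list view. Clauses that neither view can score confidently stay unlabeled, which illustrates how recall could grow incrementally while uncertain cases are left alone.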