Patents are public scientific documents protected by law, and their abstracts contain highly valuable information. Semantic annotation of patents can help protect intellectual property rights and promote corporate research innovation. Current automatic patent annotation relies mainly on supervised machine learning algorithms, which require large amounts of expensive labeled patent data. Because labeled Chinese patent data is scarce, this paper adopts co-training, a semi-supervised machine learning method that starts from a small amount of labeled data. The method combines keyword extraction with list extraction and incrementally annotates functional clauses in patent abstracts. Experimental results indicate that the method gradually improves recall without sacrificing precision.
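The incremental annotation loop described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the clause representation (a `keywords` list and a `list_id` field), the function name `co_train`, and the rule that any shared keyword or shared list is enough to label a clause are all assumptions made for illustration. A real system would apply confidence thresholds in each view before accepting a new label.

```python
def co_train(labeled, unlabeled, rounds=5):
    """Grow a set of functional clauses using two views (illustrative only).

    Each clause is a dict with hypothetical fields:
      "keywords": cue words found in the clause (keyword view)
      "list_id":  id of the enumerated list the clause belongs to,
                  or None if it is not a list item (list view)
    """
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(rounds):
        # Keyword view: cue words seen in clauses labeled so far.
        keywords = {kw for c in labeled for kw in c["keywords"]}
        # List view: lists that already contain a labeled clause.
        lists = {c["list_id"] for c in labeled if c["list_id"] is not None}

        # A clause is annotated when either view recognizes it.
        newly = [
            c for c in pool
            if (set(c["keywords"]) & keywords) or c["list_id"] in lists
        ]
        if not newly:
            break  # neither view can label anything more
        labeled.extend(newly)
        pool = [c for c in pool if c not in newly]
    return labeled, pool


# Toy data, invented for this example.
seed = [{"text": "improves heat dissipation", "keywords": ["improves"], "list_id": 1}]
candidates = [
    {"text": "reduces manufacturing cost", "keywords": ["reduces"], "list_id": 1},
    {"text": "reduces noise", "keywords": ["reduces"], "list_id": None},
    {"text": "comprises a steel frame", "keywords": ["comprises"], "list_id": 2},
]

annotated, remaining = co_train(seed, candidates)
```

In this toy run the list view labels "reduces manufacturing cost" first, because it shares a list with the seed clause. That adds the cue word "reduces" to the keyword view, which then labels "reduces noise" in the next round. The clause "comprises a steel frame" matches neither view and stays unlabeled, which is the sense in which recall can grow round by round without admitting unrelated clauses.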