A three-phase method for patent classification

Authors:
Yen-Liang Chen;Yuan-Che Chang
Venue:
Information Processing and Management: an International Journal
Year:
2012

Abstract

An automatic patent categorization system would be invaluable to individual inventors and patent attorneys, saving them time and effort by quickly identifying conflicts with existing patents. In recent years, it has become increasingly common to classify all patent documents using the International Patent Classification (IPC), a complex hierarchical classification system comprising eight sections, 128 classes, 648 subclasses, about 7,200 main groups, and approximately 72,000 subgroups. So far, however, no patent categorization method has been developed that can classify patents down to the subgroup level (the bottom level of the IPC). This paper therefore presents a novel categorization method, the three-phase categorization (TPC) algorithm, which classifies patents down to the subgroup level with reasonable accuracy. Experimental results for the TPC algorithm on the WIPO-alpha collection indicate that the method achieves 36.07% accuracy at the subgroup level, approximately a 25,764-fold improvement over a random guess.
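
The fold-improvement figure can be checked with a short calculation. The sketch below is illustrative only and rests on two assumptions that the abstract does not state: that a random guess picks one of the roughly 72,000 subgroups uniformly, and that the reported figure was obtained from a baseline rounded to 0.0014%. The names `NUM_SUBGROUPS`, `TPC_ACCURACY`, and `fold_improvement` are hypothetical.

```python
# Approximate number of IPC subgroups, as stated in the abstract.
NUM_SUBGROUPS = 72_000
# Reported TPC accuracy at the subgroup level (WIPO-alpha collection).
TPC_ACCURACY = 0.3607


def fold_improvement(accuracy: float, baseline: float) -> float:
    """Return how many times larger `accuracy` is than `baseline`."""
    return accuracy / baseline


# Assumption: a random guess selects one subgroup uniformly at random.
exact_baseline = 1 / NUM_SUBGROUPS
# Assumption: the same baseline rounded to 0.0014 percent.
rounded_baseline = 0.0014 / 100

exact_fold = fold_improvement(TPC_ACCURACY, exact_baseline)
rounded_fold = fold_improvement(TPC_ACCURACY, rounded_baseline)

print(f"Unrounded baseline: {exact_fold:,.0f}-fold")
print(f"Rounded baseline:   {rounded_fold:,.0f}-fold")
```

Under these assumptions the unrounded baseline gives about a 25,970-fold improvement, while the rounded baseline gives about 25,764-fold, matching the figure in the abstract. The gap comes only from rounding the baseline and the subgroup count.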