COA: finding novel patents through text analysis

Authors:
Mohammad Al Hasan;W. Scott Spangler;Thomas Griffin;Alfredo Alba
Affiliations:
Rensselaer Polytechnic Institute, Troy, NY, USA;IBM, San Jose, CA, USA;IBM, San Jose, CA, USA;IBM, San Jose, CA, USA
Venue:
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
Year:
2009

Citing 5
Cited 3

The anatomy of a large-scale hypertextual Web search engine

WWW7 Proceedings of the seventh international conference on World Wide Web 7
Authoritative sources in a hyperlinked environment

Proceedings of the ninth annual ACM-SIAM symposium on Discrete algorithms
Building a large annotated corpus of English: the penn treebank

Computational Linguistics - Special issue on using large corpora: II
Patent Mining - Discovery of Business Value from Patent Repositories

HICSS '07 Proceedings of the 40th Annual Hawaii International Conference on System Sciences
Information genealogy: uncovering the flow of ideas in non-hyperlinked document databases

Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining

A smarter process for sensing the information space

IBM Journal of Research and Development
Latent graphical models for quantifying and predicting patent quality

Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining
Finding nuggets in IP portfolios: core patent mining through textual temporal analysis

Proceedings of the 21st ACM international conference on Information and knowledge management

Abstract

In recent years, the number of patents filed by business enterprises in the technology industry has grown rapidly, providing unprecedented opportunities for knowledge discovery in patent data. One important task in this regard is to employ data mining techniques to rank patents by their potential to earn money through licensing. The availability of such a ranking can substantially reduce enterprise IP (Intellectual Property) management costs. Unfortunately, existing software systems in the IP domain do not address this task directly. Through our research, we built a patent ranking system named COA (Claim Originality Analysis), which rates a patent's value by measuring the recency and the impact of the important phrases that appear in the "claims" section of the patent. Experiments show that COA produces meaningful rankings when compared with other indirect patent evaluation metrics: citation count, patent status, and attorney's rating. In real-life settings, the tool was used by beta testers in the IBM IP department. Lawyers found it very useful in patent rating, specifically in highlighting potentially valuable patents in a patent cluster. In this article, we describe the ranking techniques and system architecture of COA. We also present results that validate its effectiveness.
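
The abstract names two signals, the recency and the impact of claim phrases, but does not give formulas. The sketch below is only one possible reading, not COA's actual method. It assumes that phrases have already been extracted from each patent's claims, that a phrase is "recent" if it first appeared in the corpus no more than `window` years before the patent, and that "impact" is the number of later patents whose claims reuse the phrase. The names `Patent`, `phrase_years`, `originality_score`, `rank_patents`, and `window` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patent:
    patent_id: str
    year: int
    claim_phrases: frozenset  # assumed: phrases already extracted from the claims


def phrase_years(corpus):
    """Map each phrase to the sorted years of the patents whose claims use it."""
    years = {}
    for patent in corpus:
        for phrase in patent.claim_phrases:
            years.setdefault(phrase, []).append(patent.year)
    return {phrase: sorted(ys) for phrase, ys in years.items()}


def originality_score(patent, corpus, window=2):
    """Average over claim phrases of recency (0 or 1) times impact (a count).

    Assumed definitions, not taken from the paper:
      recency = 1 if the phrase first appeared within `window` years
                before this patent, else 0
      impact  = number of later patents whose claims reuse the phrase
    """
    if not patent.claim_phrases:
        return 0.0
    years = phrase_years(corpus)
    total = 0.0
    for phrase in patent.claim_phrases:
        ys = years.get(phrase, [])
        first_seen = ys[0] if ys else patent.year
        age = max(0, patent.year - first_seen)
        recency = 1 if age <= window else 0
        impact = sum(1 for y in ys if y > patent.year)
        total += recency * impact
    return total / len(patent.claim_phrases)


def rank_patents(corpus, window=2):
    """Return (patent_id, score) pairs, highest score first."""
    scored = [(p.patent_id, originality_score(p, corpus, window)) for p in corpus]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Under these assumptions, a patent that introduces phrases later reused by many other patents scores high, while a patent whose claim phrases were already old at filing time scores zero. A real system would also need a phrase extractor and some normalization across technology areas, neither of which is attempted here.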