Automatic domain terminology extraction using graph mutual reinforcement

  • Authors:
  • Jingjing Kang;Xiaoyong Du;Tao Liu;He Hu

  • Affiliations:
  • Key Labs of Data Engineering and Knowledge Engineering, Beijing, China and School of Information, Renmin University of China, Beijing, China (all authors)

  • Venue:
  • WAIM'10 Proceedings of the 11th international conference on Web-age information management
  • Year:
  • 2010

Abstract

Information Extraction (IE) aims at mining knowledge from unstructured data. Terminology extraction is one of the crucial subtasks in IE. In this paper, we propose a novel ranking-based approach to domain terminology extraction that exploits the linkage among authors, papers and conferences in domain proceedings. Candidate terms are extracted by statistical methods and then ranked by importance values derived from mutual reinforcement in the author-paper-conference graph. Furthermore, we integrate our approach with several classical termhood-based methods, including C-value and inverse document frequency. The presented approach does not require any training data and can be extended to other domains. Experimental results show that our approach outperforms several competitive methods.
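
The abstract does not give the update rules, so the following is only a minimal sketch of what mutual reinforcement over an author-paper-conference graph could look like. It assumes a HITS-style iteration in which papers, authors and conferences repeatedly exchange normalized scores, and it assumes that a candidate term inherits the summed score of the papers containing it. The function names (`mutual_reinforcement`, `normalize`, `rank_terms`) and the input dictionaries are hypothetical, and the integration with C-value or inverse document frequency is not shown.

```python
from collections import defaultdict


def normalize(scores):
    """Scale the scores so that they sum to 1 (L1 normalization)."""
    total = sum(scores.values())
    return {key: value / total for key, value in scores.items()}


def mutual_reinforcement(paper_authors, paper_conf, iterations=50):
    """HITS-style score propagation (an assumed scheme, not the paper's exact rules).

    paper_authors: dict mapping paper id -> list of author names
    paper_conf:    dict mapping paper id -> conference name
    Returns (paper_score, author_score, conf_score), each summing to 1.
    """
    papers = list(paper_authors)
    paper_score = {p: 1.0 / len(papers) for p in papers}
    author_score, conf_score = {}, {}
    for _ in range(iterations):
        # Authors and conferences accumulate the scores of their papers.
        author_acc = defaultdict(float)
        conf_acc = defaultdict(float)
        for p in papers:
            for a in paper_authors[p]:
                author_acc[a] += paper_score[p]
            conf_acc[paper_conf[p]] += paper_score[p]
        author_score = normalize(author_acc)
        conf_score = normalize(conf_acc)
        # Papers accumulate the scores of their authors and their venue.
        paper_score = normalize({
            p: sum(author_score[a] for a in paper_authors[p])
            + conf_score[paper_conf[p]]
            for p in papers
        })
    return paper_score, author_score, conf_score


def rank_terms(candidate_terms, paper_terms, paper_score):
    """Rank candidate terms by the summed score of the papers containing them."""
    term_score = {t: 0.0 for t in candidate_terms}
    for p, terms in paper_terms.items():
        for t in set(terms):  # count each term once per paper
            if t in term_score:
                term_score[t] += paper_score[p]
    return sorted(term_score, key=term_score.get, reverse=True)
```

Under these assumptions, a term that appears in papers by well-connected authors at well-connected venues rises in the ranking, while a term confined to an isolated paper falls. No training data is involved, which matches the claim in the abstract, but the actual weighting used by the authors may differ.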