SEERLAB: A system for extracting key phrases from scholarly documents

  • Authors: Pucktada Treeratpituk; Pradeep Teregowda; Jian Huang; C. Lee Giles
  • Affiliations: Pennsylvania State University, University Park, PA (all authors)
  • Venue: SemEval '10: Proceedings of the 5th International Workshop on Semantic Evaluation
  • Year: 2010

Abstract

We describe SEERLAB, a system that participated in the SemEval-2010 Keyphrase Extraction Task. SEERLAB uses the DBLP corpus to generate a set of candidate keyphrases from a document. A Random Forest, a supervised ensemble classifier, then selects the top keyphrases from the candidate set. SEERLAB achieved an F-score of 0.24 for the top 15 keyphrases, placing it sixth among 19 participating systems. It performed particularly well on the top 5 keyphrases, where its F-score ranked third.
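
For readers unfamiliar with the metric, the following is a minimal sketch of how an F-score over the top k keyphrases can be computed for a single document. It assumes exact matching after lowercasing and whitespace trimming; the official task scorer may normalize phrases differently (for example by stemming) and aggregates over many documents. The function name `f_score_at_k` and the sample phrases are hypothetical and do not come from the paper.

```python
def f_score_at_k(predicted, gold, k):
    """Return (precision, recall, F-score) for the top-k predicted keyphrases.

    Assumption: a prediction counts as correct only if it matches a gold
    keyphrase exactly after lowercasing and trimming whitespace.
    """
    top_k = [p.strip().lower() for p in predicted[:k]]
    gold_set = {g.strip().lower() for g in gold}

    # Count each distinct correct phrase once, even if it was predicted twice.
    correct = sum(1 for p in set(top_k) if p in gold_set)

    precision = correct / len(top_k) if top_k else 0.0
    recall = correct / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0

    # F-score is the harmonic mean of precision and recall.
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Hypothetical ranked output and gold-standard keyphrases for one document.
predicted = [
    "keyphrase extraction",
    "random forest",
    "dblp",
    "digital library",
    "citation",
]
gold = [
    "keyphrase extraction",
    "random forest",
    "scholarly documents",
    "supervised learning",
]

precision, recall, f_score = f_score_at_k(predicted, gold, k=5)
print(f"P={precision:.2f} R={recall:.2f} F={f_score:.2f}")
```

In this toy case two of the five predictions match two of the four gold keyphrases, so precision is 0.4 and recall is 0.5. Evaluating the same ranked list at k = 5, 10, and 15 shows how a system can rank higher at one cutoff than at another, as reported above.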