A supervised learning approach to search of definitions

  • Authors:
  • Jun Xu; Yun-Bo Cao; Hang Li; Min Zhao; Ya-Lou Huang

  • Affiliations:
  • College of Software, Nankai University, Tianjin, P.R. China; Microsoft Research Asia, Beijing, P.R. China; Microsoft Research Asia, Beijing, P.R. China; Institute of Automation, Chinese Academy of Sciences, Beijing, P.R. China; College of Software, Nankai University, Tianjin, P.R. China

  • Venue:
  • Journal of Computer Science and Technology - Special section on China AVS standard
  • Year:
  • 2006

Abstract

This paper addresses the problem of searching for definitions. Specifically, given a term, we find its definition candidates and rank them according to their likelihood of being good definitions. This contrasts with traditional methods, which either generate a single combined definition or output all retrieved definitions. Definition ranking is essential for this task. We give a specification for judging the goodness of a definition, under which a definition is assigned to one of three levels: good, indifferent, or bad. We also propose methods for definition ranking that formalize the problem as either classification or ordinal regression. We employ SVM (Support Vector Machines) as the classification model and Ranking SVM as the ordinal regression model, and use each to rank definition candidates according to their likelihood of being good definitions. We define features for constructing the SVM and Ranking SVM models; these represent the characteristics of the term, the definition candidate, and their relationship. Experimental results indicate that SVM and Ranking SVM significantly outperform baseline methods such as heuristic rules, the conventional information retrieval method Okapi, and SVM regression. This holds both when the answers are paragraphs and when they are sentences. Experimental results also show that SVM or Ranking SVM models trained in one domain can be adapted to another domain, indicating that generic models for definition ranking can be constructed.
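
The ordinal regression formulation can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a linear scoring function trained by plain subgradient descent on a pairwise hinge loss (the idea behind Ranking SVM) rather than a real SVM solver, and the three features and their values are hypothetical toy data, not the features defined in the paper.

```python
from itertools import combinations

# Ordinal labels following the three-level specification:
# 2 = good, 1 = indifferent, 0 = bad.
# Hypothetical toy features (NOT the paper's feature set):
# [matches a "<term> is a ..." pattern, term at start, contains reported speech]
TRAIN = [
    ([1.0, 1.0, 0.0], 2),
    ([1.0, 0.0, 0.0], 2),
    ([0.0, 1.0, 0.0], 1),
    ([0.0, 0.0, 0.0], 1),
    ([0.0, 1.0, 1.0], 0),
    ([0.0, 0.0, 1.0], 0),
]


def pairwise_differences(examples):
    """Build x_higher - x_lower for every pair of examples with different labels."""
    diffs = []
    for (xa, ya), (xb, yb) in combinations(examples, 2):
        if ya == yb:
            continue  # equal levels impose no ordering constraint
        hi, lo = (xa, xb) if ya > yb else (xb, xa)
        diffs.append([h - l for h, l in zip(hi, lo)])
    return diffs


def train_pairwise_linear(diffs, epochs=200, lr=0.1, reg=0.01):
    """Subgradient descent on an L2-regularized pairwise hinge loss.

    Each difference vector d should satisfy w . d >= 1, i.e. the
    higher-level candidate should outscore the lower-level one.
    """
    w = [0.0] * len(diffs[0])
    for _ in range(epochs):
        for d in diffs:
            margin = sum(wi * di for wi, di in zip(w, d))
            violated = margin < 1.0
            for k in range(len(w)):
                grad = reg * w[k] - (d[k] if violated else 0.0)
                w[k] -= lr * grad
    return w


def score(w, x):
    """Linear ranking score of one candidate's feature vector."""
    return sum(wi * xi for wi, xi in zip(w, x))


def rank(w, candidates):
    """Sort (name, features) candidates by descending score."""
    return sorted(candidates, key=lambda c: score(w, c[1]), reverse=True)


if __name__ == "__main__":
    w = train_pairwise_linear(pairwise_differences(TRAIN))
    candidates = [
        ("candidate A", [0.0, 0.0, 1.0]),
        ("candidate B", [1.0, 0.0, 0.0]),
        ("candidate C", [0.0, 0.0, 0.0]),
    ]
    for name, feats in rank(w, candidates):
        print(name, round(score(w, feats), 3))
```

The classification variant would instead train a binary good-versus-rest model and sort candidates by its decision value; the pairwise form shown here uses all three levels of the specification directly.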