A supervised learning approach to search of definitions

Authors:
Jun Xu; Yun-Bo Cao; Hang Li; Min Zhao; Ya-Lou Huang
Affiliations:
College of Software, Nankai University, Tianjin, P.R. China; Microsoft Research Asia, Beijing, P.R. China; Microsoft Research Asia, Beijing, P.R. China; Institute of Automation, Chinese Academy of Sciences, Beijing, P.R. China; College of Software, Nankai University, Tianjin, P.R. China
Venue:
Journal of Computer Science and Technology - Special section on China AVS standard
Year:
2006

Abstract

This paper addresses the problem of searching for definitions. Specifically, for a given term, we find its definition candidates and rank them according to their likelihood of being good definitions. This is in contrast to traditional methods, which either generate a single combined definition or output all retrieved definitions. Definition ranking is essential for this task. A specification for judging the goodness of a definition is given, in which a definition is assigned to one of three levels: good definition, indifferent definition, or bad definition. Methods for performing definition ranking are also proposed, which formalize the problem as either classification or ordinal regression. We employ Support Vector Machines (SVM) as the classification model and Ranking SVM as the ordinal regression model; in both cases, definition candidates are ranked according to their likelihood of being good definitions. Features for constructing the SVM and Ranking SVM models are defined, which represent the characteristics of the term, the definition candidate, and their relationship. Experimental results indicate that SVM and Ranking SVM significantly outperform baseline methods such as heuristic rules, conventional information retrieval (Okapi), and SVM regression. This holds both when the answers are paragraphs and when they are sentences. Experimental results also show that SVM or Ranking SVM models trained in one domain can be adapted to another domain, indicating that generic models for definition ranking can be constructed.
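The ordinal regression formulation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it is a simplified linear pairwise ranker trained by subgradient descent on a hinge loss, in the spirit of Ranking SVM. The three binary features, the example sentences, and all function names are hypothetical and chosen only for illustration; the abstract does not list the actual features.

```python
from itertools import combinations

# The three levels of definition quality named in the abstract.
GOOD, INDIFFERENT, BAD = 2, 1, 0


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def train_pairwise_ranker(candidates, epochs=200, lr=0.1, reg=0.01):
    """Fit linear weights so that better-labeled candidates score higher.

    candidates: list of (feature_vector, label) pairs for one term.
    Every pair with different labels becomes a preference
    (better, worse), and the hinge loss on w . (better - worse)
    is reduced by subgradient steps with L2 shrinkage.
    """
    dim = len(candidates[0][0])
    w = [0.0] * dim
    pairs = [
        (a[0], b[0]) if a[1] > b[1] else (b[0], a[0])
        for a, b in combinations(candidates, 2)
        if a[1] != b[1]
    ]
    for _ in range(epochs):
        for better, worse in pairs:
            diff = [p - q for p, q in zip(better, worse)]
            margin = dot(w, diff)
            # L2 regularization: shrink the weights slightly.
            w = [wi * (1.0 - lr * reg) for wi in w]
            # Hinge subgradient: update only if the margin is violated.
            if margin < 1.0:
                w = [wi + lr * di for wi, di in zip(w, diff)]
    return w


def rank(w, candidates):
    """Sort (text, feature_vector) pairs by descending model score."""
    return sorted(candidates, key=lambda c: dot(w, c[1]), reverse=True)


# Hypothetical features (assumptions, not taken from the paper):
# [term starts the candidate, contains an "is a" pattern, contains a pronoun]
training = [
    ([1, 1, 0], GOOD),
    ([1, 0, 0], INDIFFERENT),
    ([0, 0, 1], BAD),
]
w = train_pairwise_ranker(training)

new_candidates = [
    ("He said it was useful.", [0, 0, 1]),
    ("Linux is a free operating system kernel.", [1, 1, 0]),
    ("Linux runs on many devices.", [1, 0, 0]),
]
ranked = rank(w, new_candidates)
for text, features in ranked:
    print(round(dot(w, features), 3), text)
```

The classification formulation would instead train a binary classifier on good versus bad candidates and sort by its decision score; the pairwise version shown here uses all three levels by learning only from the relative order between candidates.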