Studying feature generation from various data representations for answer extraction

  • Authors:
  • Dan Shen;Geert-Jan M. Kruijff;Dietrich Klakow

  • Affiliations:
  • Saarland University, Postfach, Saarbruecken, Germany;Saarland University, Postfach, Saarbruecken, Germany;Saarland University, Postfach, Saarbruecken, Germany

  • Venue:
  • FeatureEng '05 Proceedings of the ACL Workshop on Feature Engineering for Machine Learning in Natural Language Processing
  • Year:
  • 2005

Abstract

In this paper, we study how to generate features for answer extraction from various data representations, such as surface text and parse trees. Beyond the features generated from surface text, we focus on feature generation from parse trees. We propose and compare three methods for representing syntactic features in Support Vector Machines: a feature vector, a string kernel, and a tree kernel. Experiments on the TREC question answering task show that features generated from the more structured data representations significantly improve performance over features generated from surface text alone. Furthermore, the contribution of individual features is discussed in detail.
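As a rough illustration of what a string kernel over syntactic information can look like, the sketch below computes a simple p-spectrum kernel (a count of shared contiguous n-grams) between two sequences of dependency labels. This is not the kernel defined in the paper, whose details the abstract does not give; the function names, the label sequences, and the choice of n = 2 are all assumptions made for illustration.

```python
from collections import Counter


def ngrams(seq, n):
    """Count the contiguous n-grams of a label sequence."""
    return Counter(tuple(seq[i:i + n]) for i in range(len(seq) - n + 1))


def spectrum_kernel(a, b, n=2):
    """p-spectrum kernel: inner product of the two n-gram count vectors."""
    ca, cb = ngrams(a, n), ngrams(b, n)
    return sum(count * cb[gram] for gram, count in ca.items())


# Hypothetical dependency-label paths, e.g. from a question key word to the
# answer slot and from the same key word to an answer candidate.
path_question = ["nsubj", "root", "dobj", "prep"]
path_candidate = ["nsubj", "root", "dobj", "det"]

similarity = spectrum_kernel(path_question, path_candidate)
print(similarity)
```

Because the value is an inner product of count vectors, a Gram matrix of such similarities could in principle be passed to an SVM that accepts precomputed kernels, whereas the feature-vector alternative would enumerate the n-grams explicitly as features.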