Convolution kernels with feature selection for natural language processing tasks

  • Authors:
  • Jun Suzuki;Hideki Isozaki;Eisaku Maeda

  • Affiliations:
  • NTT Communication Science Laboratories, NTT Corp., Seika-cho, Soraku-gun, Kyoto, Japan;NTT Communication Science Laboratories, NTT Corp., Seika-cho, Soraku-gun, Kyoto, Japan;NTT Communication Science Laboratories, NTT Corp., Seika-cho, Soraku-gun, Kyoto, Japan

  • Venue:
  • ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics
  • Year:
  • 2004

Abstract

Convolution kernels, such as sequence and tree kernels, are attractive for many natural language processing (NLP) tasks, both conceptually and in terms of accuracy. Experiments have shown, however, that over-fitting often arises when these kernels are applied to NLP tasks. This paper discusses this problem with convolution kernels and proposes a new approach, based on statistical feature selection, that avoids it. To allow the proposed method to be executed efficiently, the feature selection is embedded into the original kernel calculation process by using sub-structure mining algorithms. Experiments are conducted on real NLP tasks to confirm the problem with the conventional method and to compare its performance with that of the proposed method.
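
The idea of restricting a convolution kernel to statistically selected sub-structures can be illustrated with a minimal sketch. The abstract does not say which statistic or mining algorithm is used, so the code below makes several assumptions: it uses a chi-squared score on a 2x2 contingency table, treats each sub-sequence as a binary feature, and enumerates sub-sequences naively rather than with an efficient mining algorithm. All function names (`subsequences`, `chi_squared`, `select_features`, `selected_kernel`) and the threshold value are hypothetical and chosen for illustration only.

```python
from itertools import combinations


def subsequences(tokens, max_len):
    """Return the set of (possibly gapped) sub-sequences of up to max_len tokens."""
    feats = set()
    for n in range(1, max_len + 1):
        for idx in combinations(range(len(tokens)), n):
            feats.add(tuple(tokens[i] for i in idx))
    return feats


def chi_squared(a, b, c, d):
    """Chi-squared statistic for a 2x2 table.

    a: positive samples containing the feature
    b: negative samples containing the feature
    c: positive samples without the feature
    d: negative samples without the feature
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def select_features(samples, labels, max_len=2, threshold=3.84):
    """Keep sub-sequences whose chi-squared score reaches the threshold (assumed value)."""
    pos_total = sum(1 for y in labels if y == 1)
    neg_total = len(labels) - pos_total
    counts = {}
    for tokens, y in zip(samples, labels):
        for f in subsequences(tokens, max_len):
            p, n = counts.get(f, (0, 0))
            counts[f] = (p + 1, n) if y == 1 else (p, n + 1)
    return {
        f
        for f, (p, n) in counts.items()
        if chi_squared(p, n, pos_total - p, neg_total - n) >= threshold
    }


def selected_kernel(x, y, selected, max_len=2):
    """Kernel value: number of selected sub-sequences shared by x and y."""
    return len(subsequences(x, max_len) & subsequences(y, max_len) & selected)


# Toy data: the first token determines the label, the second is noise.
samples = [["a", "b"], ["a", "c"], ["d", "b"], ["d", "c"]]
labels = [1, 1, 0, 0]
selected = select_features(samples, labels)
k_same = selected_kernel(samples[0], samples[1], selected)
k_diff = selected_kernel(samples[0], samples[2], selected)
```

In this toy setting only the label-bearing tokens survive selection, so two samples that share only a noise token get a kernel value of zero. A practical implementation would avoid explicit enumeration, which grows combinatorially with sequence length.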