A bio-inspired application of natural language processing: A case study in extracting multiword expression

  • Authors:
  • Jianyong Duan;Ru Li;Yi Hu

  • Affiliations:
  • Department of Computer Science and Technology, College of Information Engineering, North China University of Technology, Shijingshan District, Beijing 100144, China;School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China and Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education ...;Department of Computing, The Hong Kong Polytechnic University, Kowloon, Hong Kong

  • Venue:
  • Expert Systems with Applications: An International Journal
  • Year:
  • 2009

Quantified Score

Hi-index 12.05

Abstract

This paper proposes multiple sequence alignment (MSA) for multiword expression (MWE) extraction. The approach is motivated by gene recognition, because a textual sequence resembles a gene sequence in pattern analysis. The MSA technique is combined with error-driven rules and is more efficient than traditional methods, and it provides a guarantee for MWE recall. It uses dynamic programming to keep the candidates from combinatorial explosion, and it yields a global solution for pattern extraction rather than redundant sub-patterns. Consequently, its measures for flexible patterns are accurate. In the experiments, several advanced statistical measures are used to rank the candidates. In the comparison experiment, the MSA approach achieved better results.
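As an illustration of the dynamic-programming idea in the abstract, the following is a minimal sketch of aligning two token sequences and reading off the pattern they share. It is not the authors' implementation: it aligns only two sentences (a pairwise Needleman-Wunsch alignment, the usual building block of MSA), and the scoring values, the `*` wildcard notation, and the function names `align` and `shared_pattern` are assumptions made for this example.

```python
from typing import List, Optional, Tuple

GAP = None  # marks a slot where one sentence has no matching token


def align(a: List[str], b: List[str],
          match: int = 2, mismatch: int = -1, gap: int = -1
          ) -> List[Tuple[Optional[str], Optional[str]]]:
    """Globally align two token sequences (Needleman-Wunsch).

    The scores are illustrative assumptions, not values from the paper.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,  # pair the tokens
                              score[i - 1][j] + gap,      # gap in b
                              score[i][j - 1] + gap)      # gap in a

    # Trace back from the bottom-right cell to recover one optimal alignment.
    pairs: List[Tuple[Optional[str], Optional[str]]] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                pairs.append((a[i - 1], b[j - 1]))
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            pairs.append((a[i - 1], GAP))
            i -= 1
        else:
            pairs.append((GAP, b[j - 1]))
            j -= 1
    pairs.reverse()
    return pairs


def shared_pattern(pairs: List[Tuple[Optional[str], Optional[str]]]) -> List[str]:
    """Keep tokens both sequences agree on; collapse everything else to '*'."""
    pattern: List[str] = []
    for x, y in pairs:
        if x is not None and x == y:
            pattern.append(x)
        elif not pattern or pattern[-1] != "*":
            pattern.append("*")
    # Wildcards at either end carry no information about the expression.
    while pattern and pattern[0] == "*":
        pattern.pop(0)
    while pattern and pattern[-1] == "*":
        pattern.pop()
    return pattern


s1 = "he took the matter into account".split()
s2 = "she took everything into account".split()
print(" ".join(shared_pattern(align(s1, s2))))  # took * into account
```

The table has one cell per pair of prefixes, so the cost is proportional to the product of the two sentence lengths instead of growing with the number of possible sub-patterns. The single `*` slot shows how a flexible expression with a variable-length gap can surface as one candidate rather than as several overlapping n-grams.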