The English unknown term translation mining with improved bilingual snippets collection strategy

  • Authors:
  • Ying-Hong Liang; Jin-Xiang Li; Liang Ye; Ke Chen; Cui-Zhen Guo

  • Affiliations:
  • JiangSu Province Support Software Engineering R&D Center for Modern Information Technology Application in Enterprise, Suzhou, China, Suzhou Vocational University, Suzhou, China; Suzhou Vocational University, Suzhou, China; Suzhou Vocational University, Suzhou, China; Suzhou Vocational University, Suzhou, China; Suzhou Vocational University, Suzhou, China

  • Venue:
  • ICIC'12 Proceedings of the 8th international conference on Intelligent Computing Theories and Applications
  • Year:
  • 2012

Abstract

The traditional bilingual snippet retrieval method selects the top N snippets returned by a web search engine as the bilingual corpus. In this paper, an improved bilingual snippet retrieval method is proposed. It combines term expansion, a surface pattern matching model, and top-N bilingual snippet selection to retrieve bilingual snippets. Term expansion finds more relevant bilingual snippets, and surface pattern matching selects bilingual snippets according to the way unknown term translations are constituted. The top 100 bilingual snippets, together with those that satisfy the surface pattern matching model, serve as the final bilingual corpus for term translation mining. Experimental results show that the improved bilingual snippet retrieval method raises the top-100 inclusion rate by 2.3% over the baseline system, which verifies that the method is effective.
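
The selection step described in the abstract (keep the top N snippets, then add any remaining snippet that satisfies a surface pattern) can be illustrated with a minimal Python sketch. The abstract does not specify the actual patterns, so the three regular expressions below, the CJK character range, and the names `surface_patterns`, `matches_surface_pattern`, and `collect_snippets` are all assumptions made for illustration, not the authors' implementation. The input list is assumed to be the already ranked search results for the original and expanded queries; the search and term expansion steps themselves are not shown.

```python
import re

# Assumption: a candidate translation is a run of two or more CJK ideographs.
CJK = r"[\u4e00-\u9fff]{2,}"

def surface_patterns(term):
    """Build hypothetical surface patterns linking an English term to Chinese text."""
    t = re.escape(term)
    return [
        # Chinese text followed by the term in brackets, e.g. 中文(term)
        re.compile(rf"{CJK}\s*[\(（]\s*{t}\s*[\)）]", re.IGNORECASE),
        # The term followed by Chinese text in brackets, e.g. term(中文)
        re.compile(rf"{t}\s*[\(（]\s*{CJK}\s*[\)）]", re.IGNORECASE),
        # The term directly followed by Chinese text, e.g. term: 中文
        re.compile(rf"{t}\s*[:：,，]?\s*{CJK}", re.IGNORECASE),
    ]

def matches_surface_pattern(snippet, term):
    """Return True if the snippet satisfies any of the illustrative patterns."""
    return any(p.search(snippet) for p in surface_patterns(term))

def collect_snippets(ranked_snippets, term, top_n=100):
    """Keep the top-N snippets, then add lower-ranked ones that match a pattern."""
    selected = list(ranked_snippets[:top_n])
    seen = set(selected)
    for snippet in ranked_snippets[top_n:]:
        if snippet not in seen and matches_surface_pattern(snippet, term):
            selected.append(snippet)
            seen.add(snippet)
    return selected
```

For example, calling `collect_snippets(results, "Hidden Markov Model", top_n=100)` on a ranked result list would return the first 100 snippets plus any later snippet in which the term appears next to Chinese text in one of the assumed layouts.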