Bilingual chunk alignment based on interactional matching and probabilistic latent semantic indexing

  • Authors:
  • Feifan Liu; Qianli Jin; Jun Zhao; Bo Xu

  • Affiliations:
  • National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing (all four authors)

  • Venue:
  • IJCNLP'04: Proceedings of the First International Joint Conference on Natural Language Processing
  • Year:
  • 2004

Abstract

An integrated method for bilingual chunk partition and alignment, called “Interactional Matching”, is proposed in this paper. Unlike earlier work, our method tries to obtain as much of the necessary information as possible from the bilingual corpora themselves, and through bilingual constraints it automatically builds one-to-one chunk pairs, each associated with a chunk-pair confidence coefficient. Unlike collocation extraction methods, our method also partitions bilingual sentences entirely into chunks, leaving no fragments. Furthermore, by using Probabilistic Latent Semantic Indexing (PLSI), the method can handle not only compositional chunks but also non-compositional ones. Experiments show that, for the overall process (including partition and alignment), our method obtains 85% precision with 57% recall for written-language chunk pairs and 78% precision with 53% recall for spoken-language chunk pairs.
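
The abstract reports precision and recall separately and does not state a combined score. As a rough aid for comparing the written-language and spoken-language results, the sketch below computes the standard F-measure (harmonic mean of precision and recall) from the quoted figures. The function name `f_measure` and the dictionary `reported` are illustrative assumptions, and the resulting F1 values are derived here, not taken from the paper.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F1 when beta == 1)."""
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when both inputs are zero
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Precision/recall pairs quoted in the abstract (overall process).
reported = {"written": (0.85, 0.57), "spoken": (0.78, 0.53)}

# F1 per corpus type; these are derived values, not figures from the paper.
f1_scores = {name: f_measure(p, r) for name, (p, r) in reported.items()}

for name, score in f1_scores.items():
    print(f"{name}: F1 = {score:.3f}")
# written: F1 = 0.682
# spoken: F1 = 0.631
```

Under this measure the gap between the two settings is about five points, driven by both lower precision and lower recall on the spoken-language data.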