A method for multilingual text mining and retrieval using growing hierarchical self-organizing maps

  • Authors:
  • Hsin-Chang Yang;Chung-Hong Lee;Ding-Wen Chen

  • Affiliations:
  • Department of Information Management, National University of Kaohsiung, Kaohsiung, Taiwan;Department of Electrical Engineering, National Kaohsiung University of Applied Sciences, Kaohsiung, Taiwan;Department of Information Management, Chang Jung Christian University, Tainan, Taiwan

  • Venue:
  • Journal of Information Science
  • Year:
  • 2009

Abstract

With the increasing number of multilingual texts on the Internet, multilingual text retrieval has become an important research issue. However, the discovery of relationships between different languages remains an open problem. In this paper we propose a method that applies the growing hierarchical self-organizing map (GHSOM) model to discover knowledge from multilingual text documents. The GHSOM is trained on multilingual parallel corpora to generate hierarchical feature maps. A discovery process is then applied to these maps to reveal the relationships between documents of different languages, as well as the relationships between keywords of different languages. We conducted experiments on a set of Chinese-English bilingual parallel corpora to discover the relationships between documents in these two languages. We also used these relationships to perform multilingual information retrieval tasks. The experimental results show that our multilingual text mining approach may capture conceptual relationships among documents, as well as among keywords, written in different languages.
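
The abstract does not spell out the training or discovery steps, so the following is only a rough illustration of the general idea, not the authors' method. It swaps the GHSOM for a tiny flat, one-dimensional self-organizing map per language, and it relates the two maps simply by counting where aligned (parallel) document pairs land. The function names, the toy term-count vectors, and the linking rule are all assumptions made for this sketch.

```python
import math
import random
from collections import Counter


def best_unit(weights, v):
    """Index of the map unit whose weight vector is closest to v (squared Euclidean)."""
    return min(
        range(len(weights)),
        key=lambda u: sum((w - x) ** 2 for w, x in zip(weights[u], v)),
    )


def train_som(vectors, n_units, epochs=50, seed=0):
    """Train a flat 1-D SOM (a stand-in for the GHSOM) and return its unit weights."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_units)]
    for epoch in range(epochs):
        decay = 1 - epoch / epochs          # shrinks linearly from 1 toward 0
        lr = 0.5 * decay                    # learning rate
        radius = max(1.0, n_units / 2 * decay)
        for v in vectors:
            bmu = best_unit(weights, v)
            for u in range(n_units):
                # Gaussian neighbourhood around the best-matching unit.
                h = math.exp(-((u - bmu) ** 2) / (2 * radius ** 2))
                for d in range(dim):
                    weights[u][d] += lr * h * (v[d] - weights[u][d])
    return weights


def link_units(som_a, docs_a, som_b, docs_b):
    """Count how often the i-th parallel pair lands on each (unit_a, unit_b) pair.

    Assumed linking rule for illustration only: frequent pairs suggest that
    the two units hold conceptually related documents.
    """
    links = Counter()
    for a, b in zip(docs_a, docs_b):
        links[(best_unit(som_a, a), best_unit(som_b, b))] += 1
    return links


# Hypothetical toy term-count vectors; row i of each list is the same document
# in the two languages (a parallel pair), over separate vocabularies.
en_docs = [[2, 1, 0, 0], [1, 2, 0, 0], [0, 0, 2, 1], [0, 0, 1, 2]]
zh_docs = [[1, 2, 0, 0], [2, 1, 0, 0], [0, 0, 1, 2], [0, 0, 2, 1]]

som_en = train_som(en_docs, n_units=2, seed=1)
som_zh = train_som(zh_docs, n_units=2, seed=2)
links = link_units(som_en, en_docs, som_zh, zh_docs)
print(dict(links))
```

In this sketch, a retrieval step could take a query in one language, find its best-matching unit, and follow the most frequent link to a unit on the other language's map. A hierarchical model such as the GHSOM would additionally grow each map and expand units into child maps, which this flat version leaves out.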