World Wide Web - A Multilingual Language Resource

  • Authors:
  • Fang Li; Huanye Sheng; Wilhelm Weisweber

  • Venue:
  • WI '01 Proceedings of the First Asia-Pacific Conference on Web Intelligence: Research and Development
  • Year:
  • 2001

Abstract

This paper argues that the World Wide Web can be regarded not only as an information resource but also as a language corpus that is dynamic, multilingual, largely uncontrolled, easily accessible, and untagged. To support this idea, we implemented a method that extracts bilingual lexicons from parallel WWW pages by means of a two-stage alignment. Language pairs drawn from German, English, and Chinese were selected, but the implementation is independent of any particular natural language, domain, or markup.