Multilingual and Multimedia Information Retrieval from Web Documents

  • Authors: Marta Gatius; Manuel Bertran; Horacio Rodriguez

  • Affiliations: Technical University of Catalunya, Barcelona (all authors)

  • Venue: DEXA '04: Proceedings of the 15th International Workshop on Database and Expert Systems Applications

  • Year: 2004

Abstract

Web documents present new challenges to conventional Information Retrieval (IR) technologies. This paper describes how FameIR, a multilingual multimedia IR shell, addresses these challenges. In this shell, Cross-Language IR (CLIR) and query expansion are performed using EuroWordNet (EWN), the best developed and most widely used lexical resource for several languages. Wrapper Generation (WG) techniques, which extract information from Web documents, give access to information at a finer granularity than the whole Web page. By combining IR and WG techniques with EWN, FameIR provides a powerful facility for performing CLIR over multimedia Web documents.
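
The cross-language query expansion mentioned in the abstract can be illustrated with a minimal sketch. EuroWordNet links its per-language wordnets through a shared inter-lingual index, and the code below imitates that idea with a small hand-made table. Everything in it is an assumption made for illustration: the `TOY_LEXICON` data, the concept identifiers, and the `expand_query` function are hypothetical and are not taken from FameIR or from any EuroWordNet API.

```python
from typing import Dict, List, Set

# Hypothetical toy lexicon: language -> word -> set of shared concept ids.
# The concept ids stand in for entries of an inter-lingual index.
TOY_LEXICON: Dict[str, Dict[str, Set[str]]] = {
    "en": {
        "car": {"c1"},
        "automobile": {"c1"},
        "picture": {"c2"},
        "image": {"c2"},
    },
    "es": {
        "coche": {"c1"},
        "automóvil": {"c1"},
        "imagen": {"c2"},
    },
}


def expand_query(
    terms: List[str],
    source_lang: str,
    target_lang: str,
    lexicon: Dict[str, Dict[str, Set[str]]] = TOY_LEXICON,
) -> List[str]:
    """Map each query term to its concepts, then collect every word in the
    target language that shares at least one of those concepts."""
    source = lexicon[source_lang]
    target = lexicon[target_lang]
    expanded: List[str] = []
    for term in terms:
        concepts = source.get(term.lower(), set())
        # Sort matches so the output order is deterministic.
        matches = sorted(w for w, ids in target.items() if ids & concepts)
        for word in matches:
            if word not in expanded:
                expanded.append(word)
    return expanded


if __name__ == "__main__":
    # Translate and expand an English query into Spanish terms.
    print(expand_query(["car", "picture"], "en", "es"))
    # Same-language expansion yields synonyms of the query term.
    print(expand_query(["car"], "en", "en"))
```

Passing the same language as source and target gives monolingual synonym expansion, while different languages give a translated and expanded query. Terms missing from the toy table are simply dropped. A real system would also need word-sense disambiguation and term weighting, which this sketch leaves out.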