Representing a bilingual lexicon with suffix trees

  • Authors: Jorge Costa; Gabriel Pereira Lopes; Luís Gomes; Luís M. S. Russo

  • Affiliations: Caparica, Portugal; Caparica, Portugal; Caparica, Portugal; Caparica, Portugal

  • Venue: Proceedings of the 2011 ACM Symposium on Applied Computing
  • Year: 2011

Abstract

This paper presents a system based on generalized suffix trees that efficiently implements a set of operations over a bilingual lexicon. Besides the basic operations of adding translations to the lexicon and removing them from it, the system provides two novel query functions that we refer to as monolingual and bilingual coverage. These two functions lay the foundation for higher-level mining operations, such as the identification of translation patterns, which are the subject of ongoing research. Nevertheless, the system presented here is interesting in its own right, both for the novelty of the coverage functions and for the potential of the data structure as a whole. We compare the performance of two implementations, one based on suffix trees and the other on suffix arrays.
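
To make the idea of a suffix-indexed lexicon concrete, the following is a minimal sketch and not the authors' implementation. It assumes a naive, uncompressed suffix trie (a real generalized suffix tree compresses paths and uses far less space) and one trie per language side. The class name `SuffixIndexedLexicon` and its methods `add`, `remove` and `find` are hypothetical. The abstract does not define the monolingual and bilingual coverage functions, so the sketch only shows a plain substring lookup, which is the kind of query a generalized suffix tree supports directly.

```python
class SuffixIndexedLexicon:
    """Toy bilingual lexicon indexed by naive suffix tries (illustrative only)."""

    def __init__(self):
        # One trie per language side (0 = source, 1 = target).
        # Each node is a dict: character -> child node, plus the key None,
        # which holds the set of translation pairs whose term passes through it.
        self._tries = ({}, {})
        self._pairs = set()

    def _nodes(self, trie, term, create):
        """Yield the node reached after each character of every suffix of term."""
        for start in range(len(term)):
            node = trie
            for ch in term[start:]:
                node = node.setdefault(ch, {}) if create else node[ch]
                yield node

    def add(self, source, target):
        """Add a translation pair; adding an existing pair is a no-op."""
        pair = (source, target)
        if pair in self._pairs:
            return
        self._pairs.add(pair)
        for side, term in enumerate(pair):
            for node in self._nodes(self._tries[side], term, create=True):
                node.setdefault(None, set()).add(pair)

    def remove(self, source, target):
        """Remove a translation pair; return False if it was not present."""
        pair = (source, target)
        if pair not in self._pairs:
            return False
        self._pairs.remove(pair)
        # Nodes are left in place (no pruning) to keep the sketch short.
        for side, term in enumerate(pair):
            for node in self._nodes(self._tries[side], term, create=False):
                node[None].discard(pair)
        return True

    def find(self, substring, side=0):
        """Return the sorted pairs whose term on the given side contains substring."""
        node = self._tries[side]
        for ch in substring:
            node = node.get(ch)
            if node is None:
                return []
        return sorted(node.get(None, ()))


lexicon = SuffixIndexedLexicon()
lexicon.add("suffix tree", "árvore de sufixos")
lexicon.add("suffix array", "array de sufixos")
print(lexicon.find("suffix", side=0))
print(lexicon.find("árvore", side=1))
```

Storing every suffix explicitly costs space quadratic in the length of each term, which is why practical systems use compressed suffix trees or suffix arrays, the two alternatives the paper compares.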