Total rank distance and scaled total rank distance: two alternative metrics in computational linguistics

  • Authors:
  • Anca Dinu; Liviu P. Dinu

  • Affiliations:
  • University of Bucharest, Bucharest, Romania; University of Bucharest, Bucharest, Romania

  • Venue:
  • LD '06 Proceedings of the Workshop on Linguistic Distances
  • Year:
  • 2006

Abstract

In this paper we propose two metrics for use in various areas of computational linguistics. Our construction rests on the assumption that, in most natural languages, the most important information is carried by the first part of the unit. We introduce the total rank distance and the scaled total rank distance, prove that both are metrics, and investigate their maximum and expected values. Finally, we present a short application: we investigate the similarity of Romance languages by computing the scaled total rank distance between the digram rankings of each language.
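
The abstract does not spell out how a digram ranking is built, so the following is only a minimal sketch of one plausible reading: character digrams are counted within words and ordered by decreasing frequency. The function name `digram_ranking`, the `top_k` parameter, the letters-only preprocessing, and the alphabetical tie-breaking are all assumptions made for illustration and are not taken from the paper. The distances themselves are not implemented here, since their definitions are not given in the abstract.

```python
from collections import Counter


def digram_ranking(text, top_k=None):
    """Return character digrams ordered from most to least frequent.

    Assumptions (not from the paper): only letters are kept, text is
    lowercased, digrams do not cross word boundaries, and ties are
    broken alphabetically so the ranking is deterministic.
    """
    # Replace every non-letter with a space, then split into words.
    cleaned = "".join(ch.lower() if ch.isalpha() else " " for ch in text)
    words = cleaned.split()

    # Count each pair of adjacent letters inside every word.
    counts = Counter(
        word[i:i + 2] for word in words for i in range(len(word) - 1)
    )

    # Sort by descending count, then alphabetically for ties.
    ranked = sorted(counts, key=lambda d: (-counts[d], d))
    return ranked if top_k is None else ranked[:top_k]


if __name__ == "__main__":
    print(digram_ranking("banana band"))
```

Two such rankings, one per language and truncated to the same length with `top_k`, would then be the inputs to a rank-based distance of the kind the abstract describes.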