Compute the Term Contributed Frequency

  • Authors: Cheng-Lung Sung; Hsu-Chun Yen; Wen-Lian Hsu

  • Venue: ISDA '08 Proceedings of the 2008 Eighth International Conference on Intelligent Systems Design and Applications - Volume 02
  • Year: 2008

Abstract

In this paper, we propose an algorithm and data structure for computing the term contributed frequency (tcf) of all N-grams in a text corpus. Although term frequency is one of the standard notions of frequency in corpus-based natural language processing (NLP), applying it to N-gram approaches raises problems, such as the distortion of phrase frequencies. We attempt to overcome this drawback by building a directed acyclic graph (DAG) containing the proposed data structure and using it to retrieve more reliable term frequencies. The proposed algorithm and data structure are more efficient than traditional term frequency extraction approaches and are portable to various languages.
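
The distortion mentioned in the abstract can be illustrated with a small sketch. The abstract does not define tcf or the DAG-based structure, so the code below does not implement either; it only counts raw N-gram frequencies and shows how a short N-gram's count can be inflated by a longer phrase that contains it. The function names `ngram_counts` and `count_outside`, and the toy corpus, are hypothetical and chosen for illustration only.

```python
from collections import Counter


def ngram_counts(tokens, max_n):
    """Count every contiguous n-gram of length 1..max_n (raw term frequency)."""
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def count_outside(tokens, short, long):
    """Count occurrences of `short` whose positions are not all inside
    occurrences of `long` (an illustrative measure, not the paper's tcf)."""
    covered = set()
    for i in range(len(tokens) - len(long) + 1):
        if tuple(tokens[i:i + len(long)]) == long:
            covered.update(range(i, i + len(long)))
    outside = 0
    for i in range(len(tokens) - len(short) + 1):
        if tuple(tokens[i:i + len(short)]) == short:
            if not all(j in covered for j in range(i, i + len(short))):
                outside += 1
    return outside


# Toy corpus (hypothetical): "york" appears three times, twice inside "new york".
tokens = "new york is big . i love new york . york is a city in england".split()
counts = ngram_counts(tokens, 2)

print(counts[("york",)])          # 3: raw unigram frequency
print(counts[("new", "york")])    # 2: frequency of the longer phrase
print(count_outside(tokens, ("york",), ("new", "york")))  # 1
```

In this toy example, two of the three raw occurrences of "york" come from the phrase "new york", so the raw unigram count overstates how often "york" is used as a term in its own right. That is the kind of distortion a more reliable frequency measure would need to correct.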