Knowledge-free discovery of domain-specific multiword units

  • Authors: Axel-Cyrille Ngonga Ngomo
  • Affiliations: Institute of Computer Sciences, Leipzig, Germany
  • Venue: Proceedings of the 2008 ACM symposium on Applied computing
  • Year: 2008

Abstract

The discovery of multiword units is one of the key steps in the preprocessing of raw text. In this paper, we propose a knowledge-free approach for the discovery of such units. It not only outperforms state-of-the-art approaches but is also fully unsupervised. Furthermore, it does not require setting any threshold, which makes it suitable for use by non-experts. The proposed approach is evaluated against five other metrics on a medical corpus.