Named entity recognition in a South African context

  • Authors:
  • Anita Louis; Alta De Waal; Cobus Venter

  • Affiliations:
  • Council for Scientific and Industrial Research (CSIR), Pretoria, South Africa (all three authors)

  • Venue:
  • SAICSIT '06: Proceedings of the 2006 Annual Research Conference of the South African Institute of Computer Scientists and Information Technologists on IT Research in Developing Countries
  • Year:
  • 2006

Abstract

The feasibility of a probabilistic named entity recognition (NER) system in a South African context was tested. The system is intended for use in the cyber-forensic domain. At its core is a dynamic Bayesian network, which takes into account the probabilistic relationships between variables as well as contextual information. We illustrate the performance of the system at different probability thresholds for classification, and compare its performance with and without a name gazetteer. Our system performs competitively with similar existing systems in the information extraction domain. Future work will involve applying the system in the cyber-forensic environment, which poses new challenges such as diverse text types.
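
The idea of evaluating at different probability thresholds can be illustrated with a minimal Python sketch. It is not the authors' system: the dynamic Bayesian network is not reproduced, the per-token probabilities and gold labels below are invented toy values, and the names `classify`, `precision_recall`, and `evaluate` are hypothetical. The sketch only shows how raising the threshold tends to trade recall for precision.

```python
from typing import Dict, List, Tuple

# Invented toy posteriors for four tokens (assumption, not data from the paper).
# Each dict maps a candidate label to its probability; "O" means "not an entity".
TOKENS: List[Dict[str, float]] = [
    {"PERSON": 0.90, "O": 0.10},
    {"PERSON": 0.55, "O": 0.45},
    {"LOCATION": 0.60, "O": 0.40},
    {"O": 0.80, "PERSON": 0.20},
]
GOLD: List[str] = ["PERSON", "O", "LOCATION", "O"]


def classify(posteriors: Dict[str, float], threshold: float) -> str:
    """Return the most probable label if it reaches the threshold, else "O"."""
    label, prob = max(posteriors.items(), key=lambda kv: kv[1])
    return label if prob >= threshold else "O"


def precision_recall(predicted: List[str], gold: List[str]) -> Tuple[float, float]:
    """Token-level precision and recall over entity labels (everything but "O")."""
    true_pos = sum(1 for p, g in zip(predicted, gold) if p != "O" and p == g)
    n_pred = sum(1 for p in predicted if p != "O")
    n_gold = sum(1 for g in gold if g != "O")
    precision = true_pos / n_pred if n_pred else 0.0
    recall = true_pos / n_gold if n_gold else 0.0
    return precision, recall


def evaluate(threshold: float) -> Tuple[float, float]:
    """Classify every toy token at the given threshold and score the result."""
    predicted = [classify(p, threshold) for p in TOKENS]
    return precision_recall(predicted, GOLD)


if __name__ == "__main__":
    for t in (0.5, 0.7):
        precision, recall = evaluate(t)
        print(f"threshold={t:.1f} precision={precision:.2f} recall={recall:.2f}")
```

On this toy data, the lower threshold accepts a spurious PERSON label, while the higher threshold rejects it at the cost of missing the LOCATION entity. A gazetteer comparison could be sketched the same way by scoring two sets of posteriors, one produced with gazetteer features and one without.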