Name entity recognition using inductive logic programming

  • Authors:
  • Huong Thanh Le; Thien Huu Nguyen

  • Affiliations:
  • Hanoi University of Technology, Hanoi, Vietnam; Hanoi University of Technology, Hanoi, Vietnam

  • Venue:
  • Proceedings of the 2010 Symposium on Information and Communication Technology
  • Year:
  • 2010

Abstract

Named entity recognition (NER) is the process of locating atomic elements in text and classifying them into predefined categories such as the names of persons, organizations, and locations, and expressions of time, quantity, and percentage. NER is useful for other natural language tasks such as question answering, text summarization, and building the semantic web. This paper presents a system, called BKIE, that uses SRV, an inductive logic programming algorithm, to extract named entities from Vietnamese text. New predicates and features are added to SRV to handle characteristics of the Vietnamese language. Several strategies are also proposed to improve the efficiency of the SRV algorithm. The data set used in the experiments consists of 80 manually tagged homepages of scientists, written in Vietnamese. The best F-score obtained in the experiments is 83%, for extracting the "name" entity. This shows that SRV is an efficient NER algorithm, with the added advantages of generality and flexibility. Our future work includes (i) building a larger set of training data to improve the system's performance; (ii) implementing BKIE with parallel programming to increase the system's efficiency; and (iii) testing BKIE on other application domains to obtain a more accurate evaluation of the system.
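
For readers unfamiliar with the F-score reported above, the following is a minimal sketch of how such a score is commonly computed for entity extraction. The abstract does not say how the authors scored matches, so exact-match scoring over (document, start, end, label) tuples is an assumption here, and the function name `f_score` and the toy data are hypothetical.

```python
def f_score(predicted, gold, beta=1.0):
    """Return (precision, recall, F-beta) over exact-match entity spans.

    Assumption: an extracted entity counts as correct only if its
    document id, boundaries, and label all match a gold annotation.
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    b2 = beta * beta
    f = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f


# Hypothetical toy data: (document id, start token, end token, label).
gold = {
    ("doc1", 0, 3, "name"),
    ("doc2", 5, 8, "name"),
    ("doc3", 2, 4, "name"),
}
predicted = {
    ("doc1", 0, 3, "name"),
    ("doc2", 5, 7, "name"),  # wrong right boundary, so not a match
    ("doc3", 2, 4, "name"),
}

precision, recall, f1 = f_score(predicted, gold)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

With `beta=1.0` the score is the harmonic mean of precision and recall; a looser matching rule (for example, partial overlap) would give different numbers for the same predictions.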