Named entity recognition using inductive logic programming

Authors:
Huong Thanh Le;Thien Huu Nguyen
Affiliations:
Hanoi University of Technology, Hanoi, Vietnam;Hanoi University of Technology, Hanoi, Vietnam
Venue:
Proceedings of the 2010 Symposium on Information and Communication Technology
Year:
2010

Abstract

Named entity recognition (NER) is the task of locating atomic elements in text and classifying them into predefined categories such as the names of persons, organizations, and locations, expressions of time, quantities, and percentages. NER is useful for other natural language tasks such as question answering, text summarization, and building the semantic web. This paper presents a system, called BKIE, that uses SRV, an inductive logic programming algorithm, to extract named entities from Vietnamese text. New predicates and features are added to SRV to handle characteristics of the Vietnamese language, and several strategies are proposed to improve the efficiency of the SRV algorithm. The data set used in the experiments consists of 80 manually tagged Vietnamese-language homepages of scientists. The best F-score obtained in the experiments is 83%, for extracting the "name" entity. This result shows that SRV is an efficient NER algorithm, given its advantages of generality and flexibility. Our future work includes (i) building a larger set of training data to improve the system's performance; (ii) implementing BKIE using parallel programming to increase system efficiency; and (iii) testing BKIE on other application domains to obtain a more accurate evaluation of the system.
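
For readers unfamiliar with the metric, an F-score is conventionally the weighted harmonic mean of precision and recall. The sketch below assumes the balanced variant (F1) computed from entity-level counts; the abstract does not say which variant or matching criterion was used. The function name `f_score` and the counts in the example are hypothetical illustration values, not data from the paper.

```python
def f_score(true_positives: int, false_positives: int,
            false_negatives: int, beta: float = 1.0) -> float:
    """Return the F-beta score from entity-level counts.

    Assumption: beta = 1.0 (balanced F1). Returns 0.0 when precision
    or recall is undefined or zero.
    """
    predicted = true_positives + false_positives   # entities the system extracted
    actual = true_positives + false_negatives      # entities in the gold annotation
    if predicted == 0 or actual == 0 or true_positives == 0:
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts chosen only to illustrate the arithmetic.
example = f_score(true_positives=83, false_positives=17, false_negatives=17)
print(f"F1 = {example:.2f}")
```

With these made-up counts, precision and recall are both 83/100, so their harmonic mean is the same value.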