Japanese named entity extraction evaluation: analysis of results

  • Authors:
  • Satoshi Sekine; Yoshio Eriguchi

  • Affiliations:
  • New York University, New York, NY; Research and Development Headquarters, NTT Data Corporation, Tokyo, Japan

  • Venue:
  • COLING '00 Proceedings of the 18th conference on Computational linguistics - Volume 2
  • Year:
  • 2000

Abstract

We report on one of the two tasks in the IREX (Information Retrieval and Extraction Exercise) project, an evaluation-based project for Information Retrieval and Information Extraction in Japanese (Sekine and Isahara, 2000; IREX Committee, 1999). The project started in 1998 and concluded in September 1999, with 45 participating and collaborating groups in total from Japan and the US. This paper covers the Named Entity (NE) task, in which systems extract NEs from newspaper articles: names of organizations, persons, locations and artifacts, as well as time expressions and numeric expressions. First, we explain the task and its definition, the data we created, and the results. Second, we analyze the results, covering task difficulty across NE types and system types, domain dependency, and a comparison to human performance.