Pattern matching and discourse processing in information extraction from Japanese text

Authors:
Tsuyoshi Kitani; Yoshio Eriguchi; Masami Hara
Affiliations:
Center for Machine Translation, Carnegie Mellon University, Pittsburgh, PA; NTT Data Communications Systems Corp., Kawasaki-shi, Kanagawa, Japan; NTT Data Communications Systems Corp., Kawasaki-shi, Kanagawa, Japan
Venue:
Journal of Artificial Intelligence Research
Year:
1994

Abstract

Information extraction is the task of automatically picking out information of interest from unconstrained text. The information is usually extracted in two steps. First, sentence-level processing locates relevant pieces of information scattered throughout the text; second, discourse processing merges coreferential information to generate the output. In the first step, pieces of information are identified locally, without recognizing any relationships among them; a keyword search or simple pattern search suffices for this purpose. The second step requires deeper knowledge in order to understand the relationships among the separately identified pieces of information. Previous information extraction systems focused on the first step, partly because they were not required to link each piece of information with other pieces. To link the extracted pieces of information and map them onto a structured output format, complex discourse processing is essential. This paper reports on a Japanese information extraction system that merges information using a pattern matcher and a discourse processor. Evaluation results show a high level of system performance that approaches human performance.
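
The two steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' system: the regular expressions, the function names find_mentions and merge_mentions, and the "most recently named company" heuristic are all assumptions made for illustration, and English sentences stand in for Japanese text to keep the example readable.

```python
import re

# Hypothetical patterns, chosen only for illustration.
# FULL_NAME: capitalized words followed by a corporate suffix.
FULL_NAME = re.compile(r"([A-Z][a-z]+(?: [A-Z][a-z]+)*) (?:Corp|Inc)\.")
# DEFINITE_REF: a definite reference that points back to some company.
DEFINITE_REF = re.compile(r"\b[Tt]he company\b")


def find_mentions(sentences):
    """Step 1: locate mentions sentence by sentence, without linking them."""
    mentions = []
    for index, sentence in enumerate(sentences):
        for match in FULL_NAME.finditer(sentence):
            mentions.append((index, match.start(), "name", match.group(1)))
        for match in DEFINITE_REF.finditer(sentence):
            mentions.append((index, match.start(), "ref", match.group(0)))
    # Sort by sentence index, then by position within the sentence.
    return sorted(mentions)


def merge_mentions(mentions):
    """Step 2: toy discourse step (an assumed heuristic).

    Each definite reference is attached to the most recently named company.
    """
    entities = {}  # company name -> indices of sentences that mention it
    latest = None
    for index, _start, kind, text in mentions:
        if kind == "name":
            latest = text
            entities.setdefault(text, []).append(index)
        elif latest is not None:
            entities[latest].append(index)
    return entities


sentences = [
    "Acme Corp. announced a new plant on Monday.",
    "The company will invest in the plant next year.",
    "Borel Inc. declined to comment.",
]
entities = merge_mentions(find_mentions(sentences))
print(entities)  # {'Acme': [0, 1], 'Borel': [2]}
```

The first function only finds isolated hits, which is what a keyword or simple pattern search provides. The second function is where linking happens; a real discourse processor would need far richer knowledge than this recency rule, which fails as soon as two companies appear before the reference.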