Bootstrapping-Based Extraction of Dictionary Terms from Unsegmented Legal Text

Authors:
Masato Hagiwara;Yasuhiro Ogawa;Katsuhiko Toyama
Affiliations:
Graduate School of Information Science, Nagoya University, Nagoya, Japan 464-8603;Graduate School of Information Science, Nagoya University, Nagoya, Japan 464-8603;Graduate School of Information Science, Nagoya University, Nagoya, Japan 464-8603
Venue:
New Frontiers in Artificial Intelligence
Year:
2009

Abstract

Recent demand for translating Japanese statutes into foreign languages necessitates the compilation of standard bilingual dictionaries. To support this costly task, we propose Monaka, a bootstrapping-based lexical knowledge extraction algorithm that automatically extracts dictionary term candidates from unsegmented Japanese legal text. The algorithm is based on the Tchai algorithm and iteratively extracts reliable patterns and instances, but it uses character n-grams as contextual patterns instead and introduces a special constraint to ensure proper segmentation of the extracted terms. Experimental results show that the algorithm extracts correctly segmented, important dictionary terms with higher accuracy than conventional methods.
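
To make the general idea concrete, the following is a minimal Python sketch of bootstrapping with character n-gram contexts over unsegmented text. It is not the Monaka or Tchai algorithm: the scoring here is a plain count of distinct supporting instances or patterns (swapped in for whatever reliability scores the paper uses), the segmentation constraint mentioned in the abstract is not implemented, and the function names, the parameters (n, max_len, top_patterns, top_instances), and the toy corpus are all assumptions made for illustration.

```python
import re
from collections import defaultdict


def find_patterns(corpus, instances, n=2):
    """Map each (left, right) character n-gram context to the instances it surrounds."""
    support = defaultdict(set)
    for text in corpus:
        for inst in instances:
            start = text.find(inst)
            while start != -1:
                end = start + len(inst)
                left = text[max(0, start - n):start]
                right = text[end:end + n]
                # Keep only full-width contexts so every pattern has n characters per side.
                if len(left) == n and len(right) == n:
                    support[(left, right)].add(inst)
                start = text.find(inst, start + 1)
    return support


def extract_candidates(corpus, patterns, max_len=8):
    """Map each string found between a pattern's contexts to the patterns that found it."""
    found = defaultdict(set)
    for left, right in patterns:
        # Shortest run of 1..max_len characters between the two contexts (assumed bound).
        regex = re.compile(
            re.escape(left) + "(.{1,%d}?)" % max_len + "(?=" + re.escape(right) + ")"
        )
        for text in corpus:
            for match in regex.finditer(text):
                found[match.group(1)].add((left, right))
    return found


def bootstrap(corpus, seeds, iterations=2, top_patterns=3, top_instances=2, n=2):
    """Alternate between pattern induction and instance extraction (count-based scoring)."""
    instances = set(seeds)
    for _ in range(iterations):
        support = find_patterns(corpus, instances, n)
        # Rank patterns by how many known instances they surround (illustrative score).
        ranked = sorted(support, key=lambda p: (-len(support[p]), p))[:top_patterns]
        found = extract_candidates(corpus, ranked)
        # Rank new candidates by how many kept patterns extracted them.
        new = sorted(
            (c for c in found if c not in instances),
            key=lambda c: (-len(found[c]), c),
        )
        instances.update(new[:top_instances])
    return instances


# Hypothetical toy corpus of unsegmented definition sentences (not from the paper).
corpus = [
    "この法律において「利用者」とは、役務の提供を受ける者をいう。",
    "この法律において「事業者」とは、役務を提供する者をいう。",
    "この法律において「電気通信」とは、符号を送ることをいう。",
]
terms = bootstrap(corpus, {"利用者"})
print(sorted(terms))
```

In this toy run the single seed induces one context pair (the two characters before and after the seed), and that pair then picks up the other two quoted terms. A realistic setting would need reliability-weighted scoring and a check that extracted strings begin and end at plausible word boundaries, which is the role the abstract assigns to the segmentation constraint.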