Recent demands for translating Japanese statutes into foreign languages necessitate the compilation of standard bilingual dictionaries. To support this costly task, we propose Monaka, a bootstrapping-based lexical knowledge extraction algorithm that automatically extracts dictionary term candidates from unsegmented Japanese legal text. Like the Tchai algorithm on which it is based, Monaka extracts reliable patterns and instances iteratively; unlike Tchai, it uses character n-grams as contextual patterns and introduces a special constraint to ensure that extracted terms are properly segmented. Experimental results show that the algorithm extracts correctly segmented, important dictionary terms with higher accuracy than conventional methods.
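
To illustrate the general idea of bootstrapping with character n-gram contexts, the following is a minimal sketch and not the Monaka algorithm itself. It makes several assumptions that do not come from the abstract: patterns are scored by a plain count of the distinct known instances they surround (a stand-in for a proper reliability score), the segmentation constraint is omitted because it is not specified here, and the names `contexts`, `bootstrap`, `top_patterns`, and `max_len` are hypothetical.

```python
import re
from collections import Counter


def contexts(text, term, n=2):
    """Yield (left, right) character n-gram pairs around each occurrence of term."""
    for m in re.finditer(re.escape(term), text):
        left = text[max(0, m.start() - n):m.start()]
        right = text[m.end():m.end() + n]
        # Skip occurrences too close to the text boundary to have full n-grams.
        if len(left) == n and len(right) == n:
            yield (left, right)


def bootstrap(text, seeds, n=2, iterations=2, top_patterns=5, max_len=6):
    """Grow a set of term candidates from seed terms (simplified sketch)."""
    instances = set(seeds)
    for _ in range(iterations):
        # Assumption: score a pattern by how many distinct known instances
        # it surrounds, instead of a real reliability measure.
        support = Counter()
        for term in instances:
            for pattern in set(contexts(text, term, n)):
                support[pattern] += 1
        patterns = [p for p, _ in support.most_common(top_patterns)]

        # Harvest the shortest strings (1..max_len characters) found between
        # the left and right n-grams of each kept pattern.
        for left, right in patterns:
            regex = re.escape(left) + "(.{1,%d}?)" % max_len + re.escape(right)
            for m in re.finditer(regex, text):
                instances.add(m.group(1))
    return instances
```

For example, with the toy text `"xxAAyy zz xxBByy"` and the single seed `"AA"`, the pattern `("xx", "yy")` is learned from the seed and then picks up `"BB"` as a new candidate. Real unsegmented legal text would need a stricter scoring function and a segmentation check to avoid harvesting partial terms.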