A compression algorithm using integrated record information for translation dictionaries

Authors:
Y. Kadoya;M. Fuketa;El-Sayed Atlam;K. Morita;T. Sumitomo;J. Aoe
Affiliations:
Department of Information Science and Intelligent Systems, University of Tokushima, 2-1 Minami Josanjima Cho, Tokushima-Shi 770-8506, Japan (all authors)
Venue:
Information Sciences—Informatics and Computer Science: An International Journal - Special issue: Informatics and computer science intelligent systems applications
Year:
2004

Abstract

The trie is a well-known structure for key retrieval in natural language (NL) dictionaries used in morphological analysis, machine translation, and similar applications. As a variety of NL processing systems have been developed, the different dictionaries stored on a computer's hard disk have come to share a great deal of common information. This paper presents a method of merging individual dictionaries into a generalized dictionary. Merging reduces the total dictionary size and makes the information in each individual dictionary available to other applications. The merged dictionary contains many long keys, such as compound words and idioms, which take up considerable space when a huge key set is stored in a trie. A fast trie representation called the double-array structure is therefore introduced, together with a compression scheme that replaces long strings with the corresponding leaf node numbers of the trie. Although each merged record is larger, the total number of records decreases substantially because common information is merged. The method is evaluated experimentally, and results for nine dictionaries show that it is more efficient than previous methods.
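The two ideas in the abstract, merging records that share a key and replacing long strings with leaf node numbers, can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: it uses a plain dictionary-based trie instead of the double-array structure, it splits compound keys on spaces, and all names (`Trie`, `merge_dictionaries`, `encode_compound`, `decode_compound`) and the toy entries are hypothetical.

```python
from typing import Dict, List, Optional

END = ""  # end-of-key marker; a single character can never equal ""


class Trie:
    """Plain dict-based trie (not a double-array); each key gets a leaf number."""

    def __init__(self) -> None:
        self.root: dict = {}
        self.keys: List[str] = []  # leaf number -> key

    def insert(self, key: str) -> int:
        """Insert key if absent and return its leaf number."""
        node = self.root
        for ch in key:
            node = node.setdefault(ch, {})
        if END not in node:
            node[END] = len(self.keys)
            self.keys.append(key)
        return node[END]

    def lookup(self, key: str) -> Optional[int]:
        """Return the leaf number of key, or None if it is not stored."""
        node = self.root
        for ch in key:
            node = node.get(ch)
            if node is None:
                return None
        return node.get(END)


def merge_dictionaries(dicts: Dict[str, Dict[str, str]]) -> Dict[str, Dict[str, str]]:
    """Merge per-application dictionaries into one record per key."""
    merged: Dict[str, Dict[str, str]] = {}
    for name, entries in dicts.items():
        for key, info in entries.items():
            merged.setdefault(key, {})[name] = info
    return merged


def encode_compound(trie: Trie, compound: str) -> List[int]:
    """Assumed reading of the scheme: store a compound as word leaf numbers."""
    return [trie.insert(word) for word in compound.split()]


def decode_compound(trie: Trie, numbers: List[int]) -> str:
    """Recover the compound string from its leaf numbers."""
    return " ".join(trie.keys[n] for n in numbers)


# Toy data, invented for illustration only.
morph = {"machine": "noun", "translation": "noun"}
mt = {"machine": "kikai", "machine translation": "kikai hon'yaku"}

merged = merge_dictionaries({"morph": morph, "mt": mt})
print(len(morph) + len(mt), "records before merging,", len(merged), "after")

trie = Trie()
for key in merged:
    if " " not in key:  # single words go into the trie directly
        trie.insert(key)

code = encode_compound(trie, "machine translation")
print(code, "->", decode_compound(trie, code))
```

In this sketch the record for `machine` grows to hold fields from both dictionaries while the record count drops, which mirrors the trade-off the abstract describes. The compound is stored as a short list of integers, so its characters are not repeated in the trie.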