Data structures & program design
Algorithms (2nd ed.)
Information retrieval
The Art of Computer Programming, Volume 3: Sorting and Searching (2nd ed.)
File Structures
Performance in Practice of String Hashing Functions
Proceedings of the Fifth International Conference on Database Systems for Advanced Applications (DASFAA)
This paper investigates how variations among character coding schemes affect the performance of string hashing. The investigation covered three types of Arabic strings (single words, personal names, and document titles) and four Arabic coding schemes. The results were examined in three respects: collision rates, arithmetic code redundancy, and the contribution of arithmetic redundancy to the collision rate. Two items are considered arithmetically redundant if they have the same numerical coding value. Although the mathematical properties of the coding schemes showed some impact on the hashing results, coding scheme variation was reflected mainly in the results of hashing single dictionary words. Where the rates of arithmetic redundancy differed, the growth patterns of collisions differed as well. The results suggest that the arithmetic properties of the collating sequence of a given coding scheme are likely to have some impact on the performance of string hashing.
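The two measures named in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's hash function, table sizes, or the exact definition of "numerical coding value", so everything below is an assumption made for illustration: the coding value is taken to be the sum of a string's encoded byte values, the hash is a plain additive hash (that sum modulo the table size), and the function names `coding_value`, `hash_slot`, and `redundancy_and_collision_rates` are hypothetical.

```python
from collections import defaultdict


def coding_value(text: str, encoding: str) -> int:
    # Assumption: the "numerical coding value" is the sum of the byte
    # values of the string under the given coding scheme.
    return sum(text.encode(encoding))


def hash_slot(text: str, encoding: str, table_size: int) -> int:
    # Assumption: a simple additive hash, not the paper's actual function.
    return coding_value(text, encoding) % table_size


def redundancy_and_collision_rates(items, encoding, table_size):
    # Group items by coding value and by hash slot. Every item after the
    # first in a group counts as one redundant item or one collision.
    by_value = defaultdict(int)
    by_slot = defaultdict(int)
    for item in items:
        by_value[coding_value(item, encoding)] += 1
        by_slot[hash_slot(item, encoding, table_size)] += 1
    redundant = sum(count - 1 for count in by_value.values())
    collisions = sum(count - 1 for count in by_slot.values())
    return redundant / len(items), collisions / len(items)


if __name__ == "__main__":
    # The first two words are anagrams, so their coding values are equal
    # under any scheme that encodes each character independently.
    words = ["كتب", "بكت", "قلم"]
    for scheme in ("cp1256", "iso8859_6", "utf-8"):
        rates = redundancy_and_collision_rates(words, scheme, table_size=101)
        print(scheme, rates)
```

With an additive hash, two arithmetically redundant items always land in the same slot, so the collision rate can never fall below the redundancy rate. That is one concrete way redundancy can contribute to collisions. A hash that is sensitive to character order would weaken this link.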