Fuzzy Hamming Distance: A New Dissimilarity Measure

Authors:
Abraham Bookstein; Shmuel T. Klein; Timo Raita
Venue:
CPM '01 Proceedings of the 12th Annual Symposium on Combinatorial Pattern Matching
Year:
2001

Abstract

Many problems depend on a reliable measure of the distance or similarity between objects that are frequently represented as vectors. We consider here vectors that can be expressed as bit sequences. For such problems, the most heavily used measure is the Hamming distance, possibly normalized. The usefulness of the Hamming distance is limited by the fact that it counts only exact matches, whereas in various applications, corresponding bits that are close by, but not exactly aligned, can still be considered almost identical. We define a "fuzzy Hamming distance" that extends the Hamming concept to give partial credit for near misses, and suggest a dynamic programming algorithm that permits it to be computed efficiently. We envision many uses for such a measure.
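
The contrast the abstract draws can be illustrated with a minimal sketch. The abstract does not give the formal definition, so everything specific below is an assumption made for illustration: the function names, the three operations (insert a 1-bit, delete a 1-bit, shift a 1-bit), the cost parameters `insert_cost`, `delete_cost` and `shift_cost`, and the linear shift cost. It is an edit-distance-style dynamic program over the positions of the 1-bits, not necessarily the authors' algorithm.

```python
from typing import List, Sequence


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Classical Hamming distance: number of positions where the bits differ."""
    if len(a) != len(b):
        raise ValueError("bit vectors must have equal length")
    return sum(x != y for x, y in zip(a, b))


def fuzzy_hamming(
    a: Sequence[int],
    b: Sequence[int],
    insert_cost: float = 1.0,   # assumed cost of adding a 1-bit
    delete_cost: float = 1.0,   # assumed cost of removing a 1-bit
    shift_cost: float = 0.25,   # assumed cost of moving a 1-bit by one position
) -> float:
    """Illustrative cheapest cost of turning a into b with the three operations."""
    if len(a) != len(b):
        raise ValueError("bit vectors must have equal length")

    # Only the positions of the 1-bits matter for this cost model.
    ones_a: List[int] = [i for i, bit in enumerate(a) if bit]
    ones_b: List[int] = [j for j, bit in enumerate(b) if bit]
    m, n = len(ones_a), len(ones_b)

    # cost[i][j]: cheapest way to reconcile the first i ones of a
    # with the first j ones of b.
    cost = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        cost[i][0] = i * delete_cost
    for j in range(1, n + 1):
        cost[0][j] = j * insert_cost

    for i in range(1, m + 1):
        for j in range(1, n + 1):
            shift = shift_cost * abs(ones_a[i - 1] - ones_b[j - 1])
            cost[i][j] = min(
                cost[i - 1][j] + delete_cost,   # drop the i-th one of a
                cost[i][j - 1] + insert_cost,   # create the j-th one of b
                cost[i - 1][j - 1] + shift,     # slide one onto the other
            )
    return cost[m][n]
```

Under these assumed costs, `[1, 0, 0, 0]` and `[0, 1, 0, 0]` have Hamming distance 2 but a fuzzy distance of 0.25, which is the "partial credit for near misses" idea. When two 1-bits are far apart, deleting one and inserting the other becomes cheaper than shifting, so the sketch falls back to the Hamming value. The table has one cell per pair of 1-bit counts, so the running time is proportional to the product of the numbers of 1-bits in the two vectors.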