Many problems depend on a reliable measure of the distance or similarity between objects that are frequently represented as vectors. We consider here vectors that can be expressed as bit sequences. For such problems, the most heavily used measure is the Hamming distance, possibly normalized. The usefulness of the Hamming distance is limited by the fact that it counts only exact matches, whereas in various applications corresponding bits that are close by, but not exactly matched, can still be considered almost identical. We define a "fuzzy Hamming distance" that extends the Hamming concept to give partial credit for near misses, and we suggest a dynamic programming algorithm that permits it to be computed efficiently. We envision many uses for such a measure.
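
To make the contrast concrete, the following is a minimal sketch and not the definition or algorithm from the paper itself, which this abstract does not spell out. It assumes one plausible formalization: the 1-bits of the two vectors are aligned in order, a 1-bit may be shifted at a cost proportional to the distance moved, and an unmatched 1-bit costs as much as an ordinary mismatch. The names `fuzzy_hamming`, `shift_cost`, and `indel_cost` are hypothetical and chosen only for illustration.

```python
def hamming(a: str, b: str) -> int:
    """Classical Hamming distance: number of positions where the bits differ."""
    if len(a) != len(b):
        raise ValueError("bit vectors must have equal length")
    return sum(x != y for x, y in zip(a, b))


def fuzzy_hamming(a: str, b: str, shift_cost: float = 0.25,
                  indel_cost: float = 1.0) -> float:
    """Illustrative near-miss-tolerant distance (an assumption, not the paper's).

    Aligns the 1-bits of `a` with the 1-bits of `b` in left-to-right order.
    Matching two 1-bits costs shift_cost * (distance between their positions);
    leaving a 1-bit unmatched costs indel_cost.
    """
    if len(a) != len(b):
        raise ValueError("bit vectors must have equal length")
    p = [i for i, x in enumerate(a) if x == "1"]  # positions of 1-bits in a
    q = [j for j, y in enumerate(b) if y == "1"]  # positions of 1-bits in b

    # dp[i][j] = minimum cost of aligning the first i ones of a
    # with the first j ones of b.
    dp = [[0.0] * (len(q) + 1) for _ in range(len(p) + 1)]
    for i in range(1, len(p) + 1):
        dp[i][0] = i * indel_cost
    for j in range(1, len(q) + 1):
        dp[0][j] = j * indel_cost

    for i in range(1, len(p) + 1):
        for j in range(1, len(q) + 1):
            shift = dp[i - 1][j - 1] + shift_cost * abs(p[i - 1] - q[j - 1])
            drop_a = dp[i - 1][j] + indel_cost   # 1-bit of a left unmatched
            drop_b = dp[i][j - 1] + indel_cost   # 1-bit of b left unmatched
            dp[i][j] = min(shift, drop_a, drop_b)
    return dp[len(p)][len(q)]


if __name__ == "__main__":
    # The single 1-bit is off by one position: a near miss.
    print(hamming("0100", "0010"))        # counts two full mismatches
    print(fuzzy_hamming("0100", "0010"))  # charges only one short shift
```

Under these assumed costs, a 1-bit displaced by one position contributes 0.25 instead of the 2 that the plain Hamming distance charges, while a displacement too large to be worth shifting falls back to the cost of two unmatched bits. The table has one cell per pair of 1-bits, so the running time is proportional to the product of the numbers of 1-bits in the two vectors.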