For numerical record fields such as date and age, many types of error are likely to yield small numerical differences between observed and true values. If, for example, two different sources provide separate case reports on the same incident, the dates of onset may not match perfectly, but they are more likely to differ by a few days than by several years. A few methods are available for handling such variations in numbers. This paper proposes a new normalization technique for numerical record fields. A comparison with the Smith-Waterman distance shows a significant increase in the weight under the present technique.
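To make the motivating idea concrete, the sketch below scores two numerical values by how close they are instead of by exact equality. It is not the normalization technique the paper proposes, which the abstract does not specify. It is a generic linear-decay similarity, and the function names, the `max_diff` cutoff, and the 30-day default are all assumptions chosen for illustration.

```python
from datetime import date


def numeric_similarity(a: float, b: float, max_diff: float) -> float:
    """Linear-decay similarity in [0, 1] (illustrative, not the paper's method).

    Returns 1.0 for identical values and falls linearly to 0.0 once the
    absolute difference reaches max_diff (a hypothetical tolerance).
    """
    if max_diff <= 0:
        raise ValueError("max_diff must be positive")
    return max(0.0, 1.0 - abs(a - b) / max_diff)


def date_similarity(d1: date, d2: date, max_days: int = 30) -> float:
    """Compare two dates by their distance in days (assumed 30-day tolerance)."""
    return numeric_similarity(d1.toordinal(), d2.toordinal(), max_days)


# Two onset dates a few days apart still score highly ...
close = date_similarity(date(2004, 3, 1), date(2004, 3, 4))
# ... while dates several years apart score zero.
far = date_similarity(date(2004, 3, 1), date(1999, 3, 1))
```

With these assumed settings, onset dates three days apart keep most of their match score, while dates five years apart contribute nothing, which mirrors the intuition that small discrepancies are far more likely than large ones for reports of the same incident.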