Identity management is critical to various governmental practices, ranging from providing citizen services to enforcing homeland security. Searching for a specific identity is difficult because multiple representations of the same identity may exist as a result of unintentional errors and intentional deception. We propose a Naïve Bayes identity matching model that improves on existing techniques in terms of effectiveness. Experiments show that the proposed model performs significantly better than the exact-match based technique and achieves higher precision than the record comparison technique. In addition, the model greatly reduces the effort of manually labeling training instances by employing a semi-supervised learning approach. This training method outperforms both fully supervised and unsupervised learning. With a training dataset in which only 30% of the instances are labeled, our model achieves performance comparable to that of fully supervised learning.
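To make the idea of semi-supervised Naïve Bayes matching concrete, the following is a minimal sketch, not the authors' implementation. It assumes that each pair of identity records has already been reduced to a tuple of binary agreement flags (for example name, date of birth, address, and ID number agree or disagree), that each flag follows a Bernoulli distribution given the class, and that unlabeled pairs are folded in with expectation-maximization (EM). The abstract specifies none of these details, and the function names `m_step`, `posterior_match`, and `fit_semi_supervised` are hypothetical.

```python
import math

def m_step(xs, resp, alpha=1.0):
    """Re-estimate the model from soft labels.

    xs:   list of tuples of 0/1 agreement flags, one tuple per record pair
    resp: list of P(match) values, one per pair (0.0 or 1.0 for labeled pairs)
    alpha: Laplace smoothing constant (an assumption of this sketch)
    """
    n_feat = len(xs[0])
    w1 = sum(resp)              # expected number of matching pairs
    w0 = len(xs) - w1           # expected number of non-matching pairs
    prior = (w1 + alpha) / (len(xs) + 2 * alpha)
    theta1 = [(sum(r * x[j] for x, r in zip(xs, resp)) + alpha) / (w1 + 2 * alpha)
              for j in range(n_feat)]
    theta0 = [(sum((1 - r) * x[j] for x, r in zip(xs, resp)) + alpha) / (w0 + 2 * alpha)
              for j in range(n_feat)]
    return prior, theta1, theta0

def posterior_match(x, model):
    """P(match | x) under the Naive Bayes independence assumption."""
    prior, theta1, theta0 = model
    log1 = math.log(prior)
    log0 = math.log(1 - prior)
    for xj, p1, p0 in zip(x, theta1, theta0):
        log1 += math.log(p1 if xj else 1 - p1)
        log0 += math.log(p0 if xj else 1 - p0)
    return 1 / (1 + math.exp(log0 - log1))

def fit_semi_supervised(labeled, unlabeled, n_iter=20):
    """labeled: list of (flags, label) with label 1 = match, 0 = non-match.
    unlabeled: list of flags tuples without labels."""
    lab_x = [x for x, _ in labeled]
    lab_r = [float(y) for _, y in labeled]
    model = m_step(lab_x, lab_r)  # initialise from the labeled pairs only
    for _ in range(n_iter):
        # E-step: soft labels for the unlabeled pairs under the current model
        unl_r = [posterior_match(x, model) for x in unlabeled]
        # M-step: refit on labeled (fixed) plus unlabeled (soft) pairs
        model = m_step(lab_x + list(unlabeled), lab_r + unl_r)
    return model
```

In this sketch the labeled pairs keep their labels fixed throughout, so they anchor the two classes while the unlabeled pairs only refine the parameter estimates. A pair would be declared a match when `posterior_match` exceeds a chosen threshold such as 0.5; the threshold is also an assumption and would normally be tuned for the desired precision.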