Organizations often manage identity information for their customers, vendors, and employees. Identity management is critical to organizational practices ranging from customer relationship management to crime investigation. Searching for a specific identity is difficult because disparate identity information may exist for the same individual as a result of unintentional errors and intentional deception. In this paper we propose a hierarchical Naive Bayes model that improves on existing identity matching techniques in terms of search effectiveness. Experiments show that the proposed model performs significantly better than the exact-match technique. With 50% of the training instances labeled, the proposed semi-supervised learning approach achieves performance comparable to that of the fully supervised record comparison algorithm. Semi-supervised learning thus greatly reduces the effort of manually labeling training instances without significant performance degradation.
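To make the semi-supervised idea concrete, the following is a minimal sketch, not the paper's model: it uses a flat Bernoulli Naive Bayes classifier over binary field-agreement vectors (a simplifying assumption that omits the hierarchical structure) and fits it with EM on a toy data set in which half of the record pairs are labeled. The feature layout, the function names (`m_step`, `posterior_match`, `fit`), and the data are all hypothetical illustrations.

```python
import math

# Each record pair is a binary agreement vector, e.g. [name, birth date, address]
# (hypothetical fields). Labels: 1 = match, 0 = non-match, None = unlabeled.

def m_step(X, weights, alpha=1.0):
    """Re-estimate class priors and per-field agreement rates from soft counts."""
    n_feat = len(X[0])
    class_mass = [sum(w[c] for w in weights) for c in (0, 1)]
    total = class_mass[0] + class_mass[1]
    prior = [(class_mass[c] + alpha) / (total + 2 * alpha) for c in (0, 1)]
    theta = [
        [
            (sum(w[c] * x[j] for x, w in zip(X, weights)) + alpha)
            / (class_mass[c] + 2 * alpha)
            for j in range(n_feat)
        ]
        for c in (0, 1)
    ]
    return prior, theta

def posterior_match(x, prior, theta):
    """P(match | x) under the Naive Bayes independence assumption."""
    log_p = []
    for c in (0, 1):
        lp = math.log(prior[c])
        for j, xj in enumerate(x):
            p = theta[c][j]
            lp += math.log(p if xj else 1.0 - p)
        log_p.append(lp)
    m = max(log_p)  # subtract the max for numerical stability
    e0, e1 = math.exp(log_p[0] - m), math.exp(log_p[1] - m)
    return e1 / (e0 + e1)

def fit(X, y, n_iter=20, alpha=1.0):
    """EM: labeled pairs keep hard labels, unlabeled pairs get soft labels."""
    # Initialize from the labeled pairs only (unlabeled pairs carry zero weight).
    weights = [[0.0, 0.0] if lab is None else [1.0 - lab, float(lab)] for lab in y]
    prior, theta = m_step(X, weights, alpha)
    for _ in range(n_iter):
        # E-step: update soft labels of the unlabeled pairs.
        for i, lab in enumerate(y):
            if lab is None:
                p = posterior_match(X[i], prior, theta)
                weights[i] = [1.0 - p, p]
        # M-step: re-estimate parameters from all pairs.
        prior, theta = m_step(X, weights, alpha)
    return prior, theta

# Toy data: 6 labeled pairs and 6 unlabeled pairs (50% labeled).
X = [
    [1, 1, 1], [1, 1, 0], [1, 0, 1],   # labeled matches
    [0, 0, 0], [0, 0, 1], [0, 1, 0],   # labeled non-matches
    [1, 1, 1], [0, 0, 0], [1, 1, 0],   # unlabeled
    [0, 0, 0], [1, 0, 1], [0, 1, 0],   # unlabeled
]
y = [1, 1, 1, 0, 0, 0] + [None] * 6

prior, theta = fit(X, y)
print(posterior_match([1, 1, 1], prior, theta))  # probability that a fully agreeing pair matches
```

Because the labeled pairs keep their hard labels throughout, the two classes stay anchored to "match" and "non-match" while the unlabeled pairs only refine the estimates. A hierarchical model would additionally group the fields under latent variables, which this sketch does not attempt.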