We propose a latent feature extraction method for record linkage. We first introduce a probabilistic generative model that produces records together with their latent topics; the model is designed to exploit co-occurrence among a record's attributes. We then derive a topic estimation algorithm based on Gibbs sampling, and use the estimated topics to identify records. The algorithm is unsupervised, so it does not require training data, which is labor-intensive to prepare. We evaluated the model on bibliographic records and found that, by exploiting attribute co-occurrence, it tended to perform better on records with more attributes.
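The abstract does not spell out the generative model or the sampler, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes a simplified model in which each record has a single latent topic, each attribute has its own topic-conditional multinomial over tokens, and symmetric Dirichlet priors (`alpha`, `beta`) are placed on both. The function name `gibbs_record_topics`, its parameters, and the toy records are all hypothetical.

```python
import math
import random
from collections import defaultdict


def gibbs_record_topics(records, num_topics, iters=50, alpha=1.0, beta=0.1, seed=0):
    """Collapsed Gibbs sampling for an ASSUMED one-topic-per-record mixture model.

    records: list of records; each record is a list of attributes,
             and each attribute is a list of string tokens.
    Returns one sampled topic index per record.
    """
    rng = random.Random(seed)
    num_attrs = len(records[0])
    # Vocabulary size per attribute, used in the Dirichlet smoothing term.
    vocab_sizes = [len({w for rec in records for w in rec[a]}) for a in range(num_attrs)]

    topic_count = [0] * num_topics  # number of records assigned to each topic
    word_count = [[defaultdict(int) for _ in range(num_attrs)] for _ in range(num_topics)]
    total_count = [[0] * num_attrs for _ in range(num_topics)]

    def update(r, k, delta):
        # Add (delta=+1) or remove (delta=-1) record r from topic k's counts.
        topic_count[k] += delta
        for a, tokens in enumerate(records[r]):
            for w in tokens:
                word_count[k][a][w] += delta
            total_count[k][a] += delta * len(tokens)

    # Random initial assignment.
    z = [rng.randrange(num_topics) for _ in records]
    for r, k in enumerate(z):
        update(r, k, +1)

    for _ in range(iters):
        for r, rec in enumerate(records):
            update(r, z[r], -1)  # exclude the current record from the counts
            log_p = []
            for k in range(num_topics):
                lp = math.log(topic_count[k] + alpha)
                # Attributes contribute jointly, so tokens that co-occur across
                # attributes of the same record pull it toward the same topic.
                for a, tokens in enumerate(rec):
                    seen = defaultdict(int)
                    for i, w in enumerate(tokens):
                        num = word_count[k][a][w] + beta + seen[w]
                        den = total_count[k][a] + vocab_sizes[a] * beta + i
                        lp += math.log(num / den)
                        seen[w] += 1
                log_p.append(lp)
            # Normalize in log space before sampling to avoid underflow.
            m = max(log_p)
            weights = [math.exp(lp - m) for lp in log_p]
            z[r] = rng.choices(range(num_topics), weights=weights)[0]
            update(r, z[r], +1)
    return z


# Toy bibliographic records: [author tokens, title tokens, venue tokens].
records = [
    [["smith", "j"], ["topic", "models", "for", "linkage"], ["kdd"]],
    [["smith", "john"], ["topic", "model", "linkage"], ["kdd"]],
    [["lee", "k"], ["query", "optimization", "in", "databases"], ["icde"]],
    [["lee", "kim"], ["database", "query", "optimization"], ["icde"]],
]
topics = gibbs_record_topics(records, num_topics=2)
```

Under these assumptions, records that end up with the same sampled topic could be treated as candidates for the same entity. How the estimated topics are actually turned into linkage decisions is not specified in the abstract, so that step is left out here.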