In bibliographic databases such as DBLP and Citeseer, three kinds of entity-name problems need to be solved. First, multiple entities may share one name, which is called the name sharing problem. Second, one entity may have different names, which is called the name variant problem. Third, multiple entities may share multiple names, which is called the name mixing problem. In this paper we aim to solve all three problems with a single model, and we call this task complete entity resolution. Unlike previous work, our work uses global information drawn from data that contain two types of information: words and author names. We propose a generative latent topic model that involves both author names and words, the LDA-dual model, which extends the LDA (Latent Dirichlet Allocation) model. We also propose a method for obtaining the model parameters, which constitute the global information. Based on the obtained model parameters, we propose two algorithms to solve the three problems mentioned above. Experimental results demonstrate the effectiveness and great potential of the proposed model and algorithms.
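
To make the idea of a topic model over both words and author names more concrete, the following is a minimal sketch of one plausible generative process. It is not taken from the paper: it assumes that each document draws a topic mixture from a Dirichlet prior, and that every word and every author name is then drawn from a topic-specific distribution. All names here (`sample_dirichlet`, `generate_document`, `word_dists`, `author_dists`) and all numeric settings are hypothetical, chosen only for illustration.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one sample from a Dirichlet distribution via normalized Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_index(probs, rng):
    """Draw an index according to the given probability vector."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def generate_document(n_words, n_authors, alpha, word_dists, author_dists, rng):
    """Generate word ids and author-name ids for one document.

    Assumption (not from the source): words and author names share the
    document's topic mixture but use separate per-topic distributions.
    """
    theta = sample_dirichlet(alpha, rng)  # per-document topic mixture
    words, authors = [], []
    for _ in range(n_words):
        z = sample_index(theta, rng)  # topic assigned to this word
        words.append(sample_index(word_dists[z], rng))
    for _ in range(n_authors):
        z = sample_index(theta, rng)  # topic assigned to this author name
        authors.append(sample_index(author_dists[z], rng))
    return theta, words, authors


# Toy setup: 3 topics, 8 vocabulary words, 5 author names (all hypothetical).
rng = random.Random(0)
n_topics, vocab_size, n_names = 3, 8, 5
word_dists = [sample_dirichlet([0.5] * vocab_size, rng) for _ in range(n_topics)]
author_dists = [sample_dirichlet([0.5] * n_names, rng) for _ in range(n_topics)]

theta, words, authors = generate_document(
    n_words=6, n_authors=2, alpha=[0.5] * n_topics,
    word_dists=word_dists, author_dists=author_dists, rng=rng,
)
print(theta, words, authors)
```

In a sketch like this, the per-topic distributions in `word_dists` and `author_dists` play the role of corpus-wide parameters, which is one way to read the abstract's notion of global information; how the paper actually estimates and uses them is not specified here.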