A Graph Partitioning Approach to Entity Disambiguation Using Uncertain Information

Authors:
Emili Sapena;Lluís Padró;Jordi Turmo
Affiliations:
TALP Research Center, Universitat Politecnica de Catalunya, Barcelona, Spain;TALP Research Center, Universitat Politecnica de Catalunya, Barcelona, Spain;TALP Research Center, Universitat Politecnica de Catalunya, Barcelona, Spain
Venue:
GoTAL '08 Proceedings of the 6th international conference on Advances in Natural Language Processing
Year:
2008

Citing 16
Cited 0

An evolutionary approach to training relaxation labeling processes

Pattern Recognition Letters
The merge/purge problem for large databases

SIGMOD '95 Proceedings of the 1995 ACM SIGMOD international conference on Management of data
A Machine Learning Approach to POS Tagging

Machine Learning
Efficient clustering of high-dimensional data sets with application to reference matching

Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining
Genetic Algorithms in Search, Optimization and Machine Learning

Genetic Algorithms in Search, Optimization and Machine Learning
Adaptive duplicate detection using learnable string similarity measures

Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
Two supervised learning approaches for name disambiguation in author citations

Proceedings of the 4th ACM/IEEE-CS joint conference on Digital libraries
Iterative record linkage for cleaning and integration

Proceedings of the 9th ACM SIGMOD workshop on Research issues in data mining and knowledge discovery
Name disambiguation in author citations using a K-way spectral clustering method

Proceedings of the 5th ACM/IEEE-CS joint conference on Digital libraries
Joint deduplication of multiple record types in relational data

Proceedings of the 14th ACM international conference on Information and knowledge management
Profile-Based Object Matching for Information Integration

IEEE Intelligent Systems
Entity Resolution with Markov Logic

ICDM '06 Proceedings of the Sixth International Conference on Data Mining
Collective entity resolution in relational data

ACM Transactions on Knowledge Discovery from Data (TKDD)
Adaptive graphical approach to entity resolution

Proceedings of the 7th ACM/IEEE-CS joint conference on Digital libraries
Constraint-based entity matching

AAAI'05 Proceedings of the 20th national conference on Artificial intelligence - Volume 2
A constrained clustering approach to duplicate detection among relational data

PAKDD'07 Proceedings of the 11th Pacific-Asia conference on Advances in knowledge discovery and data mining

Quantified Score

Hi-index	0.00

Visualization

Abstract

This paper presents a method for Entity Disambiguation in Information Extraction from different sources in the web. Once entities and relations between them are extracted, it is needed to determine which ones are referring to the same real-world entity. We model the problem as a graph partitioning problem in order to combine the available information more accurately than a pairwise classifier. Moreover, our method handle uncertain information which turns out to be quite helpful. Two algorithms are trained and compared, one probabilistic and the other deterministic. Both are tuned using genetic algorithms to find the best weights for the set of constraints. Experiments show that graph-based modeling yields better results using uncertain information.