Resolving semantic heterogeneity across distinct data sources remains a relevant problem in the GIS domain. Our approach, called GSim, semantically aligns tables from different GIS databases by first choosing attributes for comparison. We then examine the instances of those attributes and calculate a similarity value between them, called entropy-based distribution (EBD), by combining two separate methods. Our primary method discerns the geographic types from the instances of the compared attributes. If geographic type matching is not possible, we apply a generic schema matching method that employs normalized Google distance. We evaluate GSim on multi-jurisdictional datasets and show that it is more effective than the traditional N-gram approach.
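To make the fallback step concrete, the following is a minimal sketch of the normalized Google distance in its standard published form, computed from search hit counts. It is not the GSim implementation: the function name, the parameter names, and the counts used in the usage note are illustrative assumptions, and the abstract does not say how the counts are obtained or how the distance is folded into EBD.

```python
import math


def normalized_google_distance(fx, fy, fxy, n):
    """Standard NGD from hit counts (hypothetical helper, not GSim code).

    fx, fy : number of pages containing term x, term y
    fxy    : number of pages containing both terms
    n      : total number of indexed pages (an assumed constant)
    """
    # If either term is unseen or the terms never co-occur,
    # treat the distance as infinite.
    if fx <= 0 or fy <= 0 or fxy <= 0:
        return float("inf")
    log_fx, log_fy, log_fxy = math.log(fx), math.log(fy), math.log(fxy)
    # NGD(x, y) = (max(log fx, log fy) - log fxy) / (log n - min(log fx, log fy))
    return (max(log_fx, log_fy) - log_fxy) / (math.log(n) - min(log_fx, log_fy))
```

With made-up counts, two terms that always appear together get a distance of 0 (for example fx = fy = fxy = 1000), while terms that co-occur on only a tenth of their pages (fx = fy = 100, fxy = 10, n = 10000) get 0.5. Smaller values indicate terms that are more closely related.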