A unifying semantic distance model for determining the similarity of attribute values

Authors:
John F. Roddick;Kathleen Hornsby;Denise de Vries
Affiliations:
School of Informatics and Engineering, Flinders University of South Australia, PO Box 2100, Adelaide 5001, South Australia;National Centre for Geographic Information and Analysis, University of Maine, Orono, Maine;School of Informatics and Engineering, Flinders University of South Australia, PO Box 2100, Adelaide 5001, South Australia
Venue:
ACSC '03 Proceedings of the 26th Australasian computer science conference - Volume 16
Year:
2003

Citing 5
Cited 13

Association rules over interval data

SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
Maintaining knowledge about temporal intervals

Communications of the ACM
Conceptual Graph Standard and Extension

ICCS '98 Proceedings of the 6th International Conference on Conceptual Structures: Theory, Tools and Applications
Assessing Semantic Similarities among Geospatial Feature Class Definitions

INTEROP '99 Proceedings of the Second International Conference on Interoperating Geographic Information Systems
Shifts in Detail through Temporal Zooming

DEXA '99 Proceedings of the 10th International Workshop on Database & Expert Systems Applications

Exact functional context matching for web services

Proceedings of the 2nd international conference on Service oriented computing
The case for mesodata: An empirical investigation of an evolving database system

Information and Software Technology
Fuzzy prototype model and semantic distance

Information Systems
SemGrAM: integrating semantic graphs into association rule mining

AusDM '07 Proceedings of the sixth Australasian conference on Data mining and analytics - Volume 70
An e-market framework to determine the strength of business relationships between intelligent agents

AusDM '07 Proceedings of the sixth Australasian conference on Data mining and analytics - Volume 70
An approach to argumentation context mining from dialogue history in an e-market scenario

AIDM '07 Proceedings of the 2nd international workshop on Integrating artificial intelligence and data mining - Volume 84
Towards a classification framework for interoperability of enterprise applications

International Journal of Computer Integrated Manufacturing
Using context to improve semantic interoperability

Proceedings of the 2006 conference on Leading the Web in Concurrent Engineering: Next Generation Concurrent Engineering
From data to knowledge mining

Artificial Intelligence for Engineering Design, Analysis and Manufacturing
A multi-level framework for the analysis of sequential data

Data Mining
Ontological distance measures for information visualisation on conceptual maps

OTM'06 Proceedings of the 2006 international conference on On the Move to Meaningful Internet Systems: AWeSOMe, CAMS, COMINF, IS, KSinBIT, MIOS-CIAO, MONET - Volume Part II
Automatic web service composition based on graph network analysis metrics

OTM'05 Proceedings of the 2005 OTM Confederated international conference on On the Move to Meaningful Internet Systems: CoopIS, COA, and ODBASE - Volume Part II
MDSM: Microarray database schema matching using the Hungarian method

Information Sciences: an International Journal

Abstract

The relative difference between two data values is of interest in a number of application domains including temporal and spatial applications, schema versioning, data warehousing (particularly data preparation), internet searching, validation and error correction, and data mining. Moreover, consistency across systems in determining such distances and the robustness of such calculations are essential in some domains and useful in many. Despite this, there is no generally adopted approach to determining such distances and no accommodation of distance within SQL or any commercially available DBMS.

For non-numeric data values, calculating the difference between values often requires application-specific support, but even for numeric values the practical distance between two values may not simply be their numeric difference or Euclidean distance.

In this paper, a model of semantic distance is developed in which a graph-based approach is used to quantify the distance between two data values. The approach facilitates a notion of distance, both as a simple traversal distance and as weighted arcs. Transition costs, as an additional expense of passing through a node, are also accommodated. Furthermore, multiple distance measures can be incorporated, and a method of 'localisation' is discussed which allows relevant information to take precedence over less relevant information. Some results from our investigations, including our SQL-based implementation, are presented.
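
The graph-based notion of distance described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' SQL-based implementation. It assumes that distance is the cheapest path between two values, that arc weights are non-negative, and that a node's transition cost is charged only when a path passes through it as an intermediate node. The function name `semantic_distance`, the example graph, and all weights are hypothetical.

```python
import heapq

def semantic_distance(graph, transition_cost, source, target):
    """Cheapest path cost from source to target (Dijkstra's algorithm).

    graph: dict mapping node -> {neighbour: non-negative arc weight}
    transition_cost: dict mapping node -> cost of passing through that node
                     (assumption: charged for intermediate nodes only)
    """
    if source == target:
        return 0.0
    best = {source: 0.0}          # cheapest known cost of arriving at each node
    queue = [(0.0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == target:
            return cost
        if cost > best.get(node, float("inf")):
            continue              # stale queue entry
        # Leaving an intermediate node incurs its transition cost.
        through = 0.0 if node == source else transition_cost.get(node, 0.0)
        for neighbour, weight in graph.get(node, {}).items():
            new_cost = cost + through + weight
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                heapq.heappush(queue, (new_cost, neighbour))
    return float("inf")           # target unreachable

# Hypothetical graph over three attribute values.
graph = {
    "A": {"B": 1.0, "C": 5.0},
    "B": {"A": 1.0, "C": 1.0},
    "C": {"A": 5.0, "B": 1.0},
}

# Weighted arcs only: the route A -> B -> C costs 1 + 1 = 2.
no_costs = semantic_distance(graph, {}, "A", "C")

# A transition cost of 4 at B makes that route cost 1 + 4 + 1 = 6,
# so the direct arc A -> C (cost 5) becomes the cheaper path.
with_costs = semantic_distance(graph, {"B": 4.0}, "A", "C")
```

Setting every arc weight to 1 and leaving the transition costs empty reduces this to the simple traversal distance, that is, the number of arcs on the shortest path.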