Weighted pseudo-distances for categorization in semantic hierarchies

Authors:
Cliff A. Joslyn;William J. Bruno
Affiliations:
Computer and Computational Sciences, Los Alamos National Laboratory, Los Alamos, NM;Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM
Venue:
ICCS'05 Proceedings of the 13th international conference on Conceptual Structures: common Semantics for Sharing Knowledge
Year:
2005

Citing 5
Cited 0

Fuzzy sets and fuzzy logic: theory and applications
Type elaboration and subtype completion for Java bytecode. Proceedings of the 27th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Architecture of Systems Problem Solving
The Gene Ontology Categorizer. Bioinformatics
Using information content to evaluate semantic similarity in a taxonomy. IJCAI'95 Proceedings of the 14th international joint conference on Artificial intelligence - Volume 1

Abstract

Ontologies, taxonomies, and other semantic hierarchies are increasingly necessary for organizing large quantities of data. We continue our development of knowledge discovery techniques based on combinatorial algorithms rooted in order theory. Our aim is to supplement the pseudo-distances we previously developed as structural measures of vertical height in poset-based ontologies with quantitative measures of vertical distance based on additional statistical information. In this way, we seek to weight different portions of the underlying ontology according to this external information source. We also wish to remedy deficiencies of existing measures of this kind, in particular Resnik's measure of semantic similarity in lexical databases such as WordNet. We begin by recalling and developing some basic concepts for ordered data objects, including our pseudo-distances and the operation of probability distributions as weights on posets. We then discuss and critique Resnik's measure before introducing our own sense of link weights and weighted normalized pseudo-distances among comparable nodes.
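For readers unfamiliar with Resnik's measure, the following is a minimal sketch of its standard definition: the similarity of two nodes is the largest information content, -log p(c), among their common subsumers c, where p(c) is the probability mass at or below c. The toy taxonomy, the counts, and all function names here are hypothetical and chosen only for illustration. The sketch does not reproduce the authors' pseudo-distances or link weights.

```python
import math

# Hypothetical toy taxonomy: each node maps to its direct parents.
# A node may have several parents, so the structure is a poset, not only a tree.
PARENTS = {
    "entity": [],
    "animal": ["entity"],
    "plant": ["entity"],
    "dog": ["animal"],
    "cat": ["animal"],
}

# Hypothetical occurrence counts attached directly to each node.
COUNTS = {"entity": 0, "animal": 2, "plant": 4, "dog": 1, "cat": 1}


def ancestors(node):
    """Return the node together with every node above it."""
    seen = {node}
    stack = [node]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def probability(node):
    """p(c): share of all counts that fall at or below node c."""
    total = sum(COUNTS.values())
    below = sum(n for c, n in COUNTS.items() if node in ancestors(c))
    return below / total


def resnik_similarity(a, b):
    """Largest information content -log p(c) over common subsumers c."""
    common = ancestors(a) & ancestors(b)
    return max(-math.log(probability(c)) for c in common)
```

With these assumed counts, "dog" and "cat" share "animal" as their most informative common subsumer, so their similarity is log 2, while "dog" and "plant" meet only at the root, whose probability is 1, giving a similarity of 0.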