Exploring dictionary-based semantic relatedness in labeled tree data

Authors:
Andrea Tagarelli
Affiliations:
Department of Electronics, Computer and Systems Sciences (DEIS), University of Calabria, Italy
Venue:
Information Sciences: an International Journal
Year:
2013

Abstract

The growing volume and heterogeneity of application scenarios built on semistructured data call for next-generation methods that effectively couple syntactic with semantic information in data management and mining tasks. This paper focuses on developing methods for determining semantic relatedness in tree-shaped semistructured data and on assessing the impact of these methods on structural sense ranking in such data. By exploiting key features of a lexical knowledge base such as WordNet, namely ontological relations and concept definitions, we propose a twofold approach that takes into account the particular form of labeled tree data as a conceptual hierarchical representation of real-world objects. We infer indirect relationships between tag concepts and exploit an interleaved search through different concept hierarchies in order to extend semantic relatedness measures, originally conceived for plain-text data, to labeled tree data instances. We also develop a structural sense ranking framework that employs a context graph built on the tag concepts and the structural relations among tags in the tree data. Experiments on a large real-world collection of Wikipedia articles show that the proposed methods can effectively detect and maximize semantic relatedness in tree-structured data, and can be profitably used to perform structural sense ranking.
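To make the idea of dictionary-based sense ranking over tag labels more concrete, the following is a minimal illustrative sketch, not the method proposed in the paper. It assumes a hypothetical toy sense inventory (`SENSES`) in place of WordNet, a plain word-overlap score between concept definitions (in the spirit of gloss-overlap measures), and a context restricted to parent-child edges of the tree. All names (`SENSES`, `TREE_EDGES`, `gloss_overlap`, `rank_senses`) are assumptions introduced only for this example.

```python
from collections import defaultdict

# Hypothetical toy sense inventory: tag -> {sense id: definition (gloss)}.
# A real system would draw these from a lexical knowledge base such as WordNet.
SENSES = {
    "book": {
        "book#1": "a written work or composition that has been published",
        "book#2": "arrange for and reserve something in advance",
    },
    "author": {
        "author#1": "a person who writes a published work professionally",
    },
    "title": {
        "title#1": "the name of a published work or composition",
        "title#2": "an established right to ownership of property",
    },
}

STOPWORDS = {"a", "an", "the", "of", "or", "and", "for", "in", "to",
             "that", "has", "been", "who"}

# Hypothetical labeled tree, given as (parent tag, child tag) edges.
TREE_EDGES = [("book", "author"), ("book", "title")]


def gloss_words(gloss):
    """Return the set of non-stopword tokens in a definition."""
    return {w for w in gloss.lower().split() if w not in STOPWORDS}


def gloss_overlap(gloss_a, gloss_b):
    """Count the content words shared by two definitions."""
    return len(gloss_words(gloss_a) & gloss_words(gloss_b))


def rank_senses(edges, senses):
    """Rank each tag's candidate senses by overlap with neighboring tags.

    For every structurally adjacent tag, a sense is credited with its best
    overlap against any candidate sense of that neighbor.
    """
    neighbors = defaultdict(set)
    for parent, child in edges:
        neighbors[parent].add(child)
        neighbors[child].add(parent)

    ranking = {}
    for tag, candidates in senses.items():
        scores = {}
        for sense_id, gloss in candidates.items():
            scores[sense_id] = sum(
                max((gloss_overlap(gloss, g) for g in senses[n].values()),
                    default=0)
                for n in neighbors[tag] if n in senses
            )
        # Highest score first; ties broken by sense id for determinism.
        ranking[tag] = sorted(scores, key=lambda s: (-scores[s], s))
    return ranking


if __name__ == "__main__":
    for tag, ordered in rank_senses(TREE_EDGES, SENSES).items():
        print(tag, ordered)
```

In this toy setting the publication sense of `book` ranks above the reservation sense because its definition shares words with the definitions of the neighboring `author` and `title` tags. The approach described in the abstract is richer: it also uses ontological relations, indirect relationships between tag concepts, and a context graph over the tree structure.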