Natural language queries over heterogeneous linked data graphs: a distributional-compositional semantics approach

Authors:
Andre Freitas; Edward Curry
Affiliations:
National University of Ireland, Galway, Galway, Ireland; National University of Ireland, Galway, Galway, Ireland
Venue:
Proceedings of the 19th international conference on Intelligent User Interfaces
Year:
2014


Abstract

The demand to access large amounts of heterogeneous structured data is emerging as a trend for many users and applications. However, the effort involved in querying heterogeneous and distributed third-party databases can create major barriers for data consumers. At the core of this problem is the semantic gap between the way users express their information needs and the representation of the data. This work aims to provide a natural language interface and an associated semantic index that support a higher level of vocabulary independence for queries over Linked Data/Semantic Web datasets, using a distributional-compositional semantics approach. Distributional semantics focuses on the automatic construction of a semantic model based on the statistical distribution of co-occurring words in large-scale text corpora. The proposed query model targets the following features: (i) a principled semantic approximation approach with low adaptation effort (independent of manually created resources such as ontologies, thesauri or dictionaries), (ii) comprehensive semantic matching supported by the inclusion of large volumes of distributional (unstructured) commonsense knowledge in the semantic approximation process, and (iii) expressive natural language queries. Evaluated with natural language queries over an open-domain dataset, the approach achieves an average recall of 0.81, a mean average precision of 0.62 and a mean reciprocal rank of 0.49.
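
To make the idea of distributional semantics concrete, the following is a minimal sketch of how co-occurrence statistics can bridge a vocabulary gap between a query term and dataset terms. It is an illustration under stated assumptions, not the authors' model: the toy corpus, the window size, raw co-occurrence counts with cosine similarity, and the function names `cooccurrence_vectors`, `cosine` and `rank_candidates` are all hypothetical choices made for this example.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` tokens of it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_candidates(query_term, candidates, vectors):
    """Rank candidate dataset terms by distributional relatedness to the query term."""
    empty = Counter()
    query_vec = vectors.get(query_term, empty)
    scored = [(c, cosine(query_vec, vectors.get(c, empty))) for c in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy corpus (an assumption for illustration; a real model would use large-scale text).
corpus = [
    "she is the wife of the president",
    "she is the spouse of the president",
    "the river flows through the city",
]
vectors = cooccurrence_vectors(corpus, window=2)

# The query says "wife"; the dataset vocabulary might instead use "spouse".
ranking = rank_candidates("wife", ["spouse", "river"], vectors)
print(ranking)
```

In this toy setting "wife" and "spouse" occur in identical contexts, so "spouse" ranks above "river" even though the two strings share no characters and no thesaurus is consulted. Raw counts over such a small corpus are dominated by function words; a realistic setup would apply a weighting scheme and a much larger corpus.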